Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1 | %----------------------------------------------------------------------------- |
| 2 | % |
| 3 | % Thrift whitepaper |
| 4 | % |
| 5 | % Name: thrift.tex |
| 6 | % |
| 7 | % Authors: Mark Slee (mcslee@facebook.com) |
| 8 | % |
| 9 | % Created: 05 March 2007 |
| 10 | % |
David Reiss | e9aab10 | 2008-04-03 20:16:53 +0000 | [diff] [blame^] | 11 | % You will need a copy of sigplanconf.cls to format this document. |
| 12 | % It is available at <http://www.sigplan.org/authorInformation.htm>. |
| 13 | % |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 14 | %----------------------------------------------------------------------------- |
| 15 | |
| 16 | |
| 17 | \documentclass[nocopyrightspace,blockstyle]{sigplanconf} |
| 18 | |
| 19 | \usepackage{amssymb} |
| 20 | \usepackage{amsfonts} |
| 21 | \usepackage{amsmath} |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 22 | \usepackage{url} |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 23 | |
| 24 | \begin{document} |
| 25 | |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 26 | % \conferenceinfo{WXYZ '05}{date, City.} |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 27 | % \copyrightyear{2007} |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 28 | % \copyrightdata{[to be supplied]} |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 29 | |
| 30 | % \titlebanner{banner above paper title} % These are ignored unless |
| 31 | % \preprintfooter{short description of paper} % 'preprint' option specified. |
| 32 | |
| 33 | \title{Thrift: Scalable Cross-Language Services Implementation} |
| 34 | \subtitle{} |
| 35 | |
| 36 | \authorinfo{Mark Slee, Aditya Agarwal and Marc Kwiatkowski} |
| 37 | {Facebook, 156 University Ave, Palo Alto, CA} |
| 38 | {\{mcslee,aditya,marc\}@facebook.com} |
| 39 | |
| 40 | \maketitle |
| 41 | |
| 42 | \begin{abstract} |
| 43 | Thrift is a software library and set of code-generation tools developed at |
| 44 | Facebook to expedite development and implementation of efficient and scalable |
| 45 | backend services. Its primary goal is to enable efficient and reliable |
| 46 | communication across programming languages by abstracting the portions of each |
| 47 | language that tend to require the most customization into a common library |
| 48 | that is implemented in each language. Specifically, Thrift allows developers to |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 49 | define datatypes and service interfaces in a single language-neutral file |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 50 | and generate all the necessary code to build RPC clients and servers. |
| 51 | |
| 52 | This paper details the motivations and design choices we made in Thrift, as |
| 53 | well as some of the more interesting implementation details. It is not |
| 54 | intended to be taken as research, but rather it is an exposition on what we did |
| 55 | and why. |
| 56 | \end{abstract} |
| 57 | |
| 58 | % \category{D.3.3}{Programming Languages}{Language constructs and features} |
| 59 | |
| 60 | %\terms |
| 61 | %Languages, serialization, remote procedure call |
| 62 | |
| 63 | %\keywords |
| 64 | %Data description language, interface definition language, remote procedure call |
| 65 | |
| 66 | \section{Introduction} |
| 67 | As Facebook's traffic and network structure have scaled, the resource |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 68 | demands of many operations on the site (i.e. search, |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 69 | ad selection and delivery, event logging) have presented technical requirements |
| 70 | drastically outside the scope of the LAMP framework. In our implementation of |
| 71 | these services, various programming languages have been selected to |
| 72 | optimize for the right combination of performance, ease and speed of |
| 73 | development, availability of existing libraries, etc. By and large, |
| 74 | Facebook's engineering culture has tended towards choosing the best |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 75 | tools and implementations available over standardizing on any one |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 76 | programming language and begrudgingly accepting its inherent limitations. |
| 77 | |
| 78 | Given this design choice, we were presented with the challenge of building |
| 79 | a transparent, high-performance bridge across many programming languages. |
| 80 | We found that most available solutions were either too limited, did not offer |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 81 | sufficient datatype freedom, or suffered from subpar performance. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 82 | \footnote{See Appendix A for a discussion of alternative systems.} |
| 83 | |
| 84 | The solution that we have implemented combines a language-neutral software |
| 85 | stack implemented across numerous programming languages and an associated code |
| 86 | generation engine that transforms a simple interface and data definition |
| 87 | language into client and server remote procedure call libraries. |
| 88 | Choosing static code generation over a dynamic system allows us to create |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 89 | validated code that can be run without the need for |
| 90 | any advanced introspective run-time type checking. It is also designed to |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 91 | be as simple as possible for the developer, who can typically define all |
| 92 | the necessary data structures and interfaces for a complex service in a single |
| 93 | short file. |
| 94 | |
| 95 | Surprised that a robust open solution to these relatively common problems |
| 96 | did not yet exist, we committed early on to making the Thrift implementation |
| 97 | open source. |
| 98 | |
| 99 | In evaluating the challenges of cross-language interaction in a networked |
| 100 | environment, some key components were identified: |
| 101 | |
| 102 | \textit{Types.} A common type system must exist across programming languages |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 103 | without requiring that the application developer use custom Thrift datatypes |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 104 | or write their own serialization code. That is, |
| 105 | a C++ programmer should be able to transparently exchange a strongly typed |
| 106 | STL map for a dynamic Python dictionary. Neither |
| 107 | programmer should be forced to write any code below the application layer |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 108 | to achieve this. Section 2 details the Thrift type system. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 109 | |
| 110 | \textit{Transport.} Each language must have a common interface to |
| 111 | bidirectional raw data transport. The specifics of how a given |
| 112 | transport is implemented should not matter to the service developer. |
| 113 | The same application code should be able to run against TCP stream sockets, |
| 114 | raw data in memory, or files on disk. Section 3 details the Thrift Transport |
| 115 | layer. |
| 116 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 117 | \textit{Protocol.} Datatypes must have some way of using the Transport |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 118 | layer to encode and decode themselves. Again, the application |
| 119 | developer need not be concerned by this layer. Whether the service uses |
| 120 | an XML or binary protocol is immaterial to the application code. |
| 121 | All that matters is that the data can be read and written in a consistent, |
| 122 | deterministic matter. Section 4 details the Thrift Protocol layer. |
| 123 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 124 | \textit{Versioning.} For robust services, the involved datatypes must |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 125 | provide a mechanism for versioning themselves. Specifically, |
| 126 | it should be possible to add or remove fields in an object or alter the |
| 127 | argument list of a function without any interruption in service (or, |
| 128 | worse yet, nasty segmentation faults). Section 5 details Thrift's versioning |
| 129 | system. |
| 130 | |
| 131 | \textit{Processors.} Finally, we generate code capable of processing data |
Aditya Agarwal | af524ee | 2007-03-31 08:28:06 +0000 | [diff] [blame] | 132 | streams to accomplish remote procedure calls. Section 6 details the generated |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 133 | code and TProcessor paradigm. |
| 134 | |
| 135 | Section 7 discusses implementation details, and Section 8 describes |
| 136 | our conclusions. |
| 137 | |
| 138 | \section{Types} |
| 139 | |
| 140 | The goal of the Thrift type system is to enable programmers to develop using |
| 141 | completely natively defined types, no matter what programming language they |
| 142 | use. By design, the Thrift type system does not introduce any special dynamic |
| 143 | types or wrapper objects. It also does not require that the developer write |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 144 | any code for object serialization or transport. The Thrift IDL (Interface |
| 145 | Definition Language) file is |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 146 | logically a way for developers to annotate their data structures with the |
| 147 | minimal amount of extra information necessary to tell a code generator |
| 148 | how to safely transport the objects across languages. |
| 149 | |
| 150 | \subsection{Base Types} |
| 151 | |
| 152 | The type system rests upon a few base types. In considering which types to |
| 153 | support, we aimed for clarity and simplicity over abundance, focusing |
| 154 | on the key types available in all programming languages, ommitting any |
| 155 | niche types available only in specific languages. |
| 156 | |
| 157 | The base types supported by Thrift are: |
| 158 | \begin{itemize} |
| 159 | \item \texttt{bool} A boolean value, true or false |
| 160 | \item \texttt{byte} A signed byte |
| 161 | \item \texttt{i16} A 16-bit signed integer |
| 162 | \item \texttt{i32} A 32-bit signed integer |
| 163 | \item \texttt{i64} A 64-bit signed integer |
| 164 | \item \texttt{double} A 64-bit floating point number |
| 165 | \item \texttt{string} An encoding-agnostic text or binary string |
| 166 | \end{itemize} |
| 167 | |
| 168 | Of particular note is the absence of unsigned integer types. Because these |
| 169 | types have no direct translation to native primitive types in many languages, |
| 170 | the advantages they afford are lost. Further, there is no way to prevent the |
| 171 | application developer in a language like Python from assigning a negative value |
| 172 | to an integer variable, leading to unpredictable behavior. From a design |
| 173 | standpoint, we observed that unsigned integers were very rarely, if ever, used |
| 174 | for arithmetic purposes, but in practice were much more often used as keys or |
| 175 | identifiers. In this case, the sign is irrelevant. Signed integers serve this |
| 176 | same purpose and can be safely cast to their unsigned counterparts (most |
| 177 | commonly in C++) when absolutely necessary. |
| 178 | |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 179 | \subsection{Structs} |
| 180 | |
Aditya Agarwal | af524ee | 2007-03-31 08:28:06 +0000 | [diff] [blame] | 181 | A Thrift struct defines a common object to be used across languages. A struct |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 182 | is essentially equivalent to a class in object oriented programming |
| 183 | languages. A struct has a set of strongly typed fields, each with a unique |
| 184 | name identifier. The basic syntax for defining a Thrift struct looks very |
| 185 | similar to a C struct definition. Fields may be annotated with an integer field |
| 186 | identifier (unique to the scope of that struct) and optional default values. |
| 187 | Field identifiers will be automatically assigned if omitted, though they are |
| 188 | strongly encouraged for versioning reasons discussed later. |
| 189 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 190 | \subsection{Containers} |
| 191 | |
| 192 | Thrift containers are strongly typed containers that map to the most commonly |
| 193 | used containers in common programming languages. They are annotated using |
| 194 | the C++ template (or Java Generics) style. There are three types available: |
| 195 | \begin{itemize} |
| 196 | \item \texttt{list<type>} An ordered list of elements. Translates directly into |
| 197 | an STL \texttt{vector}, Java \texttt{ArrayList}, or native array in scripting languages. May |
| 198 | contain duplicates. |
| 199 | \item \texttt{set<type>} An unordered set of unique elements. Translates into |
| 200 | an STL \texttt{set}, Java \texttt{HashSet}, \texttt{set} in Python, or native |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 201 | dictionary in PHP/Ruby. |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 202 | \item \texttt{map<type1,type2>} A map of strictly unique keys to values |
Mark Slee | 09bfd61 | 2007-04-04 19:56:41 +0000 | [diff] [blame] | 203 | Translates into an STL \texttt{map}, Java \texttt{HashMap}, PHP associative |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 204 | array, or Python/Ruby dictionary. |
| 205 | \end{itemize} |
| 206 | |
| 207 | While defaults are provided, the type mappings are not explicitly fixed. Custom |
| 208 | code generator directives have been added to substitute custom types in |
| 209 | destination languages (i.e. |
| 210 | \texttt{hash\_map} or Google's sparse hash map can be used in C++). The |
| 211 | only requirement is that the custom types support all the necessary iteration |
| 212 | primitives. Container elements may be of any valid Thrift type, including other |
| 213 | containers or structs. |
| 214 | |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 215 | \begin{verbatim} |
| 216 | struct Example { |
| 217 | 1:i32 number=10, |
| 218 | 2:i64 bigNumber, |
| 219 | 3:double decimals, |
| 220 | 4:string name="thrifty" |
| 221 | }\end{verbatim} |
| 222 | |
| 223 | In the target language, each definition generates a type with two methods, |
| 224 | \texttt{read} and \texttt{write}, which perform serialization and transport |
| 225 | of the objects using a Thrift TProtocol object. |
| 226 | |
| 227 | \subsection{Exceptions} |
| 228 | |
| 229 | Exceptions are syntactically and functionally equivalent to structs except |
| 230 | that they are declared using the \texttt{exception} keyword instead of the |
| 231 | \texttt{struct} keyword. |
| 232 | |
| 233 | The generated objects inherit from an exception base class as appropriate |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 234 | in each target programming language, in order to seamlessly |
| 235 | integrate with native exception handling in any given |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 236 | language. Again, the design emphasis is on making the code familiar to the |
| 237 | application developer. |
| 238 | |
| 239 | \subsection{Services} |
| 240 | |
| 241 | Services are defined using Thrift types. Definition of a service is |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 242 | semantically equivalent to defining an interface (or a pure virtual abstract |
| 243 | class) in object oriented |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 244 | programming. The Thrift compiler generates fully functional client and |
| 245 | server stubs that implement the interface. Services are defined as follows: |
| 246 | |
| 247 | \begin{verbatim} |
| 248 | service <name> { |
| 249 | <returntype> <name>(<arguments>) |
| 250 | [throws (<exceptions>)] |
| 251 | ... |
| 252 | }\end{verbatim} |
| 253 | |
| 254 | An example: |
| 255 | |
| 256 | \begin{verbatim} |
| 257 | service StringCache { |
| 258 | void set(1:i32 key, 2:string value), |
| 259 | string get(1:i32 key) throws (1:KeyNotFound knf), |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 260 | void delete(1:i32 key) |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 261 | } |
| 262 | \end{verbatim} |
| 263 | |
| 264 | Note that \texttt{void} is a valid type for a function return, in addition to |
| 265 | all other defined Thrift types. Additionally, an \texttt{async} modifier |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 266 | keyword may be added to a \texttt{void} function, which will generate code that does |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 267 | not wait for a response from the server. Note that a pure \texttt{void} |
| 268 | function will return a response to the client which guarantees that the |
| 269 | operation has completed on the server side. With \texttt{async} method calls |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 270 | the client will only be guaranteed that the request succeeded at the |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 271 | transport layer. (In many transport scenarios this is inherently unreliable |
| 272 | due to the Byzantine Generals' Problem. Therefore, application developers |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 273 | should take care only to use the async optimization in cases where dropped |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 274 | method calls are acceptable or the transport is known to be reliable.) |
| 275 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 276 | Also of note is the fact that argument lists and exception lists for functions |
| 277 | are implemented as Thrift structs. All three constructs are identical in both |
| 278 | notation and behavior. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 279 | |
| 280 | \section{Transport} |
| 281 | |
| 282 | The transport layer is used by the generated code to facilitate data transfer. |
| 283 | |
| 284 | \subsection{Interface} |
| 285 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 286 | A key design choice in the implementation of Thrift was to decouple the |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 287 | transport layer from the code generation layer. Though Thrift is typically |
| 288 | used on top of the TCP/IP stack with streaming sockets as the base layer of |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 289 | communication, there was no compelling reason to build that constraint into |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 290 | the system. The performance tradeoff incurred by an abstracted I/O layer |
| 291 | (roughly one virtual method lookup / function call per operation) was |
| 292 | immaterial compared to the cost of actual I/O operations (typically invoking |
| 293 | system calls). |
| 294 | |
Aditya Agarwal | af524ee | 2007-03-31 08:28:06 +0000 | [diff] [blame] | 295 | Fundamentally, generated Thrift code only needs to know how to read and |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 296 | write data. The origin and destination of the data are irrelevant; it may be a |
| 297 | socket, a segment of shared memory, or a file on the local disk. The Thrift |
| 298 | transport interface supports the following methods: |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 299 | |
| 300 | \begin{itemize} |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 301 | \item \texttt{open} Opens the tranpsort |
| 302 | \item \texttt{close} Closes the tranport |
| 303 | \item \texttt{isOpen} Indicates whether the transport is open |
| 304 | \item \texttt{read} Reads from the transport |
| 305 | \item \texttt{write} Writes to the transport |
| 306 | \item \texttt{flush} Forces any pending writes |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 307 | \end{itemize} |
| 308 | |
| 309 | There are a few additional methods not documented here which are used to aid |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 310 | in batching reads and optionally signaling the completion of a read or |
| 311 | write operation from the generated code. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 312 | |
| 313 | In addition to the above |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 314 | \texttt{TTransport} interface, there is a\\ |
| 315 | \texttt{TServerTransport} interface |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 316 | used to accept or create primitive transport objects. Its interface is as |
| 317 | follows: |
| 318 | |
| 319 | \begin{itemize} |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 320 | \item \texttt{open} Opens the transport |
| 321 | \item \texttt{listen} Begins listening for connections |
| 322 | \item \texttt{accept} Returns a new client transport |
| 323 | \item \texttt{close} Closes the transport |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 324 | \end{itemize} |
| 325 | |
| 326 | \subsection{Implementation} |
| 327 | |
| 328 | The transport interface is designed for simple implementation in any |
| 329 | programming language. New transport mechanisms can be easily defined as needed |
| 330 | by application developers. |
| 331 | |
| 332 | \subsubsection{TSocket} |
| 333 | |
| 334 | The \texttt{TSocket} class is implemented across all target languages. It |
| 335 | provides a common, simple interface to a TCP/IP stream socket. |
| 336 | |
| 337 | \subsubsection{TFileTransport} |
| 338 | |
| 339 | The \texttt{TFileTransport} is an abstraction of an on-disk file to a data |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 340 | stream. It can be used to write out a set of incoming Thrift requests to a file |
| 341 | on disk. The on-disk data can then be replayed from the log, either for |
| 342 | post-processing or for reproduction and/or simulation of past events. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 343 | |
| 344 | \subsubsection{Utilities} |
| 345 | |
| 346 | The Transport interface is designed to support easy extension using common |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 347 | OOP techniques, such as composition. Some simple utilites include the |
| 348 | \texttt{TBufferedTransport}, which buffers the writes and reads on an |
| 349 | underlying transport, the \texttt{TFramedTransport}, which transmits data with frame |
| 350 | size headers for chunking optimization or nonblocking operation, and the |
| 351 | \texttt{TMemoryBuffer}, which allows reading and writing directly from the heap |
| 352 | or stack memory owned by the process. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 353 | |
| 354 | \section{Protocol} |
| 355 | |
| 356 | A second major abstraction in Thrift is the separation of data structure from |
| 357 | transport representation. Thrift enforces a certain messaging structure when |
| 358 | transporting data, but it is agnostic to the protocol encoding in use. That is, |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 359 | it does not matter whether data is encoded as XML, human-readable ASCII, or a |
| 360 | dense binary format as long as the data supports a fixed set of operations |
| 361 | that allow it to be deterministically read and written by generated code. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 362 | |
| 363 | \subsection{Interface} |
| 364 | |
| 365 | The Thrift Protocol interface is very straightforward. It fundamentally |
| 366 | supports two things: 1) bidirectional sequenced messaging, and |
| 367 | 2) encoding of base types, containers, and structs. |
| 368 | |
| 369 | \begin{verbatim} |
| 370 | writeMessageBegin(name, type, seq) |
| 371 | writeMessageEnd() |
| 372 | writeStructBegin(name) |
| 373 | writeStructEnd() |
| 374 | writeFieldBegin(name, type, id) |
| 375 | writeFieldEnd() |
| 376 | writeFieldStop() |
| 377 | writeMapBegin(ktype, vtype, size) |
| 378 | writeMapEnd() |
| 379 | writeListBegin(etype, size) |
| 380 | writeListEnd() |
| 381 | writeSetBegin(etype, size) |
| 382 | writeSetEnd() |
| 383 | writeBool(bool) |
| 384 | writeByte(byte) |
| 385 | writeI16(i16) |
| 386 | writeI32(i32) |
| 387 | writeI64(i64) |
| 388 | writeDouble(double) |
| 389 | writeString(string) |
| 390 | |
| 391 | name, type, seq = readMessageBegin() |
| 392 | readMessageEnd() |
| 393 | name = readStructBegin() |
| 394 | readStructEnd() |
| 395 | name, type, id = readFieldBegin() |
| 396 | readFieldEnd() |
| 397 | k, v, size = readMapBegin() |
| 398 | readMapEnd() |
| 399 | etype, size = readListBegin() |
| 400 | readListEnd() |
| 401 | etype, size = readSetBegin() |
| 402 | readSetEnd() |
| 403 | bool = readBool() |
| 404 | byte = readByte() |
| 405 | i16 = readI16() |
| 406 | i32 = readI32() |
| 407 | i64 = readI64() |
| 408 | double = readDouble() |
| 409 | string = readString() |
| 410 | \end{verbatim} |
| 411 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 412 | Note that every \texttt{write} function has exactly one \texttt{read} counterpart, with |
| 413 | the exception of \texttt{writeFieldStop()}. This is a special method |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 414 | that signals the end of a struct. The procedure for reading a struct is to |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 415 | \texttt{readFieldBegin()} until the stop field is encountered, and then to |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 416 | \texttt{readStructEnd()}. The |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 417 | generated code relies upon this call sequence to ensure that everything written by |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 418 | a protocol encoder can be read by a matching protocol decoder. Further note |
| 419 | that this set of functions is by design more robust than necessary. |
| 420 | For example, \texttt{writeStructEnd()} is not strictly necessary, as the end of |
| 421 | a struct may be implied by the stop field. This method is a convenience for |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 422 | verbose protocols in which it is cleaner to separate these calls (e.g. a closing |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 423 | \texttt{</struct>} tag in XML). |
| 424 | |
| 425 | \subsection{Structure} |
| 426 | |
| 427 | Thrift structures are designed to support encoding into a streaming |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 428 | protocol. The implementation should never need to frame or compute the |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 429 | entire data length of a structure prior to encoding it. This is critical to |
| 430 | performance in many scenarios. Consider a long list of relatively large |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 431 | strings. If the protocol interface required reading or writing a list to be an |
| 432 | atomic operation, then the implementation would need to perform a linear pass over the |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 433 | entire list before encoding any data. However, if the list can be written |
| 434 | as iteration is performed, the corresponding read may begin in parallel, |
Aditya Agarwal | af524ee | 2007-03-31 08:28:06 +0000 | [diff] [blame] | 435 | theoretically offering an end-to-end speedup of $(kN - C)$, where $N$ is the size |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 436 | of the list, $k$ the cost factor associated with serializing a single |
| 437 | element, and $C$ is fixed offset for the delay between data being written |
| 438 | and becoming available to read. |
| 439 | |
| 440 | Similarly, structs do not encode their data lengths a priori. Instead, they are |
| 441 | encoded as a sequence of fields, with each field having a type specifier and a |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 442 | unique field identifier. Note that the inclusion of type specifiers allows |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 443 | the protocol to be safely parsed and decoded without any generated code |
| 444 | or access to the original IDL file. Structs are terminated by a field header |
| 445 | with a special \texttt{STOP} type. Because all the basic types can be read |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 446 | deterministically, all structs (even those containing other structs) can be |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 447 | read deterministically. The Thrift protocol is self-delimiting without any |
| 448 | framing and regardless of the encoding format. |
| 449 | |
| 450 | In situations where streaming is unnecessary or framing is advantageous, it |
| 451 | can be very simply added into the transport layer, using the |
| 452 | \texttt{TFramedTransport} abstraction. |
| 453 | |
| 454 | \subsection{Implementation} |
| 455 | |
| 456 | Facebook has implemented and deployed a space-efficient binary protocol which |
| 457 | is used by most backend services. Essentially, it writes all data |
| 458 | in a flat binary format. Integer types are converted to network byte order, |
| 459 | strings are prepended with their byte length, and all message and field headers |
| 460 | are written using the primitive integer serialization constructs. String names |
| 461 | for fields are omitted - when using generated code, field identifiers are |
| 462 | sufficient. |
| 463 | |
| 464 | We decided against some extreme storage optimizations (i.e. packing |
| 465 | small integers into ASCII or using a 7-bit continuation format) for the sake |
| 466 | of simplicity and clarity in the code. These alterations can easily be made |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 467 | if and when we encounter a performance-critical use case that demands them. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 468 | |
| 469 | \section{Versioning} |
| 470 | |
| 471 | Thrift is robust in the face of versioning and data definition changes. This |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 472 | is critical to enable staged rollouts of changes to deployed services. The |
| 473 | system must be able to support reading of old data from log files, as well as |
| 474 | requests from out-of-date clients to new servers, and vice versa. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 475 | |
| 476 | \subsection{Field Identifiers} |
| 477 | |
| 478 | Versioning in Thrift is implemented via field identifiers. The field header |
| 479 | for every member of a struct in Thrift is encoded with a unique field |
| 480 | identifier. The combination of this field identifier and its type specifier |
| 481 | is used to uniquely identify the field. The Thrift definition language |
| 482 | supports automatic assignment of field identifiers, but it is good |
| 483 | programming practice to always explicitly specify field identifiers. |
| 484 | Identifiers are specified as follows: |
| 485 | |
| 486 | \begin{verbatim} |
| 487 | struct Example { |
| 488 | 1:i32 number=10, |
| 489 | 2:i64 bigNumber, |
| 490 | 3:double decimals, |
| 491 | 4:string name="thrifty" |
| 492 | }\end{verbatim} |
| 493 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 494 | To avoid conflicts between manually and automatically assigned identifiers, |
| 495 | fields with identifiers omitted are assigned identifiers |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 496 | decrementing from -1, and the language only supports the manual assignment of |
| 497 | positive identifiers. |
| 498 | |
| 499 | When data is being deserialized, the generated code can use these identifiers |
| 500 | to properly identify the field and determine whether it aligns with a field in |
| 501 | its definition file. If a field identifier is not recognized, the generated |
| 502 | code can use the type specifier to skip the unknown field without any error. |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 503 | Again, this is possible due to the fact that all datatypes are self |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 504 | delimiting. |
| 505 | |
| 506 | Field identifiers can (and should) also be specified in function argument |
| 507 | lists. In fact, argument lists are not only represented as structs on the |
| 508 | backend, but actually share the same code in the compiler frontend. This |
| 509 | allows for version-safe modification of method parameters |
| 510 | |
| 511 | \begin{verbatim} |
| 512 | service StringCache { |
| 513 | void set(1:i32 key, 2:string value), |
| 514 | string get(1:i32 key) throws (1:KeyNotFound knf), |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 515 | void delete(1:i32 key) |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 516 | } |
| 517 | \end{verbatim} |
| 518 | |
| 519 | The syntax for specifying field identifiers was chosen to echo their structure. |
| 520 | Structs can be thought of as a dictionary where the identifiers are keys, and |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 521 | the values are strongly-typed named fields. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 522 | |
| 523 | Field identifiers internally use the \texttt{i16} Thrift type. Note, however, |
| 524 | that the \texttt{TProtocol} abstraction may encode identifiers in any format. |
| 525 | |
| 526 | \subsection{Isset} |
| 527 | |
| 528 | When an unexpected field is encountered, it can be safely ignored and |
| 529 | discarded. When an expected field is not found, there must be some way to |
| 530 | signal to the developer that it was not present. This is implemented via an |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 531 | inner \texttt{isset} structure inside the defined objects. (Isset functionality |
| 532 | is implicit with a \texttt{null} value in PHP, \texttt{None} in Python |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 533 | and \texttt{nil} in Ruby.) Essentially, |
| 534 | the inner \texttt{isset} object of each Thrift struct contains a boolean value |
| 535 | for each field which denotes whether or not that field is present in the |
| 536 | struct. When a reader receives a struct, it should check for a field being set |
| 537 | before operating directly on it. |
| 538 | |
| 539 | \begin{verbatim} |
| 540 | class Example { |
| 541 | public: |
| 542 | Example() : |
| 543 | number(10), |
| 544 | bigNumber(0), |
| 545 | decimals(0), |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 546 | name("thrifty") {} |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 547 | |
| 548 | int32_t number; |
| 549 | int64_t bigNumber; |
| 550 | double decimals; |
| 551 | std::string name; |
| 552 | |
| 553 | struct __isset { |
| 554 | __isset() : |
| 555 | number(false), |
| 556 | bigNumber(false), |
| 557 | decimals(false), |
| 558 | name(false) {} |
| 559 | bool number; |
| 560 | bool bigNumber; |
| 561 | bool decimals; |
| 562 | bool name; |
| 563 | } __isset; |
| 564 | ... |
| 565 | } |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 566 | \end{verbatim} |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 567 | |
| 568 | \subsection{Case Analysis} |
| 569 | |
| 570 | There are four cases in which version mismatches may occur. |
| 571 | |
| 572 | \begin{enumerate} |
| 573 | \item \textit{Added field, old client, new server.} In this case, the old |
| 574 | client does not send the new field. The new server recognizes that the field |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 575 | is not set, and implements default behavior for out-of-date requests. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 576 | \item \textit{Removed field, old client, new server.} In this case, the old |
| 577 | client sends the removed field. The new server simply ignores it. |
| 578 | \item \textit{Added field, new client, old server.} The new client sends a |
| 579 | field that the old server does not recognize. The old server simply ignores |
| 580 | it and processes as normal. |
| 581 | \item \textit{Removed field, new client, old server.} This is the most |
| 582 | dangerous case, as the old server is unlikely to have suitable default |
| 583 | behavior implemented for the missing field. It is recommended that in this |
| 584 | situation the new server be rolled out prior to the new clients. |
| 585 | \end{enumerate} |
| 586 | |
| 587 | \subsection{Protocol/Transport Versioning} |
| 588 | The \texttt{TProtocol} abstractions are also designed to give protocol |
| 589 | implementations the freedom to version themselves in whatever manner they |
| 590 | see fit. Specifically, any protocol implementation is free to send whatever |
| 591 | it likes in the \texttt{writeMessageBegin()} call. It is entirely up to the |
| 592 | implementor how to handle versioning at the protocol level. The key point is |
| 593 | that protocol encoding changes are safely isolated from interface definition |
| 594 | version changes. |
| 595 | |
| 596 | Note that the exact same is true of the \texttt{TTransport} interface. For |
| 597 | example, if we wished to add some new checksumming or error detection to the |
| 598 | \texttt{TFileTransport}, we could simply add a version header into the |
| 599 | data it writes to the file in such a way that it would still accept old |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 600 | log files without the given header. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 601 | |
| 602 | \section{RPC Implementation} |
| 603 | |
| 604 | \subsection{TProcessor} |
| 605 | |
| 606 | The last core interface in the Thrift design is the \texttt{TProcessor}, |
| 607 | perhaps the most simple of the constructs. The interface is as follows: |
| 608 | |
| 609 | \begin{verbatim} |
| 610 | interface TProcessor { |
| 611 | bool process(TProtocol in, TProtocol out) |
| 612 | throws TException |
| 613 | } |
| 614 | \end{verbatim} |
| 615 | |
| 616 | The key design idea here is that the complex systems we build can fundamentally |
| 617 | be broken down into agents or services that operate on inputs and outputs. In |
| 618 | most cases, there is actually just one input and output (an RPC client) that |
| 619 | needs handling. |
| 620 | |
| 621 | \subsection{Generated Code} |
| 622 | |
| 623 | When a service is defined, we generate a |
| 624 | \texttt{TProcessor} instance capable of handling RPC requests to that service, |
| 625 | using a few helpers. The fundamental structure (illustrated in pseudo-C++) is |
| 626 | as follows: |
| 627 | |
| 628 | \begin{verbatim} |
| 629 | Service.thrift |
| 630 | => Service.cpp |
| 631 | interface ServiceIf |
| 632 | class ServiceClient : virtual ServiceIf |
| 633 | TProtocol in |
| 634 | TProtocol out |
| 635 | class ServiceProcessor : TProcessor |
| 636 | ServiceIf handler |
| 637 | |
| 638 | ServiceHandler.cpp |
| 639 | class ServiceHandler : virtual ServiceIf |
| 640 | |
| 641 | TServer.cpp |
| 642 | TServer(TProcessor processor, |
| 643 | TServerTransport transport, |
| 644 | TTransportFactory tfactory, |
| 645 | TProtocolFactory pfactory) |
| 646 | serve() |
| 647 | \end{verbatim} |
| 648 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 649 | From the Thrift definition file, we generate the virtual service interface. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 650 | A client class is generated, which implements the interface and |
| 651 | uses two \texttt{TProtocol} instances to perform the I/O operations. The |
| 652 | generated processor implements the \texttt{TProcessor} interface. The generated |
| 653 | code has all the logic to handle RPC invocations via the \texttt{process()} |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 654 | call, and takes as a parameter an instance of the service interface, as |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 655 | implemented by the application developer. |
| 656 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 657 | The user provides an implementation of the application interface in separate, |
| 658 | non-generated source code. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 659 | |
| 660 | \subsection{TServer} |
| 661 | |
| 662 | Finally, the Thrift core libraries provide a \texttt{TServer} abstraction. |
| 663 | The \texttt{TServer} object generally works as follows. |
| 664 | |
| 665 | \begin{itemize} |
| 666 | \item Use the \texttt{TServerTransport} to get a \texttt{TTransport} |
| 667 | \item Use the \texttt{TTransportFactory} to optionally convert the primitive |
| 668 | transport into a suitable application transport (typically the |
| 669 | \texttt{TBufferedTransportFactory} is used here) |
| 670 | \item Use the \texttt{TProtocolFactory} to create an input and output protocol |
| 671 | for the \texttt{TTransport} |
| 672 | \item Invoke the \texttt{process()} method of the \texttt{TProcessor} object |
| 673 | \end{itemize} |
| 674 | |
| 675 | The layers are appropriately separated such that the server code needs to know |
| 676 | nothing about any of the transports, encodings, or applications in play. The |
| 677 | server encapsulates the logic around connection handling, threading, etc. |
| 678 | while the processor deals with RPC. The only code written by the application |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 679 | developer lives in the definitional Thrift file and the interface |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 680 | implementation. |
| 681 | |
| 682 | Facebook has deployed multiple \texttt{TServer} implementations, including |
| 683 | the single-threaded \texttt{TSimpleServer}, thread-per-connection |
| 684 | \texttt{TThreadedServer}, and thread-pooling \texttt{TThreadPoolServer}. |
| 685 | |
| 686 | The \texttt{TProcessor} interface is very general by design. There is no |
| 687 | requirement that a \texttt{TServer} take a generated \texttt{TProcessor} |
| 688 | object. Thrift allows the application developer to easily write any type of |
| 689 | server that operates on \texttt{TProtocol} objects (for instance, a server |
| 690 | could simply stream a certain type of object without any actual RPC method |
| 691 | invocation). |
| 692 | |
| 693 | \section{Implementation Details} |
| 694 | \subsection{Target Languages} |
| 695 | Thrift currently supports five target languages: C++, Java, Python, Ruby, and |
| 696 | PHP. At Facebook, we have deployed servers predominantly in C++, Java, and |
| 697 | Python. Thrift services implemented in PHP have also been embedded into the |
| 698 | Apache web server, providing transparent backend access to many of our |
| 699 | frontend constructs using a \texttt{THttpClient} implementation of the |
| 700 | \texttt{TTransport} interface. |
| 701 | |
| 702 | Though Thrift was explicitly designed to be much more efficient and robust |
| 703 | than typical web technologies, as we were designing our XML-based REST web |
| 704 | services API we noticed that Thrift could be easily used to define our |
| 705 | service interface. Though we do not currently employ SOAP envelopes (in the |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 706 | authors' opinions there is already far too much repetitive enterprise Java |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 707 | software to do that sort of thing), we were able to quickly extend Thrift to |
| 708 | generate XML Schema Definition files for our service, as well as a framework |
| 709 | for versioning different implementations of our web service. Though public |
| 710 | web services are admittedly tangential to Thrift's core use case and design, |
| 711 | Thrift facilitated rapid iteration and affords us the ability to quickly |
| 712 | migrate our entire XML-based web service onto a higher performance system |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 713 | should the need arise. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 714 | |
| 715 | \subsection{Generated Structs} |
| 716 | We made a conscious decision to make our generated structs as transparent as |
| 717 | possible. All fields are publicly accessible; there are no \texttt{set()} and |
| 718 | \texttt{get()} methods. Similarly, use of the \texttt{isset} object is not |
| 719 | enforced. We do not include any \texttt{FieldNotSetException} construct. |
| 720 | Developers have the option to use these fields to write more robust code, but |
| 721 | the system is robust to the developer ignoring the \texttt{isset} construct |
| 722 | entirely and will provide suitable default behavior in all cases. |
| 723 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 724 | This choice was motivated by the desire to ease application development. Our stated |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 725 | goal is not to make developers learn a rich new library in their language of |
| 726 | choice, but rather to generate code that allow them to work with the constructs |
| 727 | that are most familiar in each language. |
| 728 | |
| 729 | We also made the \texttt{read()} and \texttt{write()} methods of the generated |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 730 | objects public so that the objects can be used outside of the context |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 731 | of RPC clients and servers. Thrift is a useful tool simply for generating |
| 732 | objects that are easily serializable across programming languages. |
| 733 | |
| 734 | \subsection{RPC Method Identification} |
| 735 | Method calls in RPC are implemented by sending the method name as a string. One |
| 736 | issue with this approach is that longer method names require more bandwidth. |
| 737 | We experimented with using fixed-size hashes to identify methods, but in the |
| 738 | end concluded that the savings were not worth the headaches incurred. Reliably |
| 739 | dealing with conflicts across versions of an interface definition file is |
| 740 | impossible without a meta-storage system (i.e. to generate non-conflicting |
| 741 | hashes for the current version of a file, we would have to know about all |
| 742 | conflicts that ever existed in any previous version of the file). |
| 743 | |
| 744 | We wanted to avoid too many unnecessary string comparisons upon |
| 745 | method invocation. To deal with this, we generate maps from strings to function |
| 746 | pointers, so that invocation is effectively accomplished via a constant-time |
| 747 | hash lookup in the common case. This requires the use of a couple interesting |
| 748 | code constructs. Because Java does not have function pointers, process |
| 749 | functions are all private member classes implementing a common interface. |
| 750 | |
| 751 | \begin{verbatim} |
| 752 | private class ping implements ProcessFunction { |
| 753 | public void process(int seqid, |
| 754 | TProtocol iprot, |
| 755 | TProtocol oprot) |
| 756 | throws TException |
| 757 | { ...} |
| 758 | } |
| 759 | |
| 760 | HashMap<String,ProcessFunction> processMap_ = |
| 761 | new HashMap<String,ProcessFunction>(); |
| 762 | \end{verbatim} |
| 763 | |
| 764 | In C++, we use a relatively esoteric language construct: member function |
| 765 | pointers. |
| 766 | |
| 767 | \begin{verbatim} |
| 768 | std::map<std::string, |
| 769 | void (ExampleServiceProcessor::*)(int32_t, |
| 770 | facebook::thrift::protocol::TProtocol*, |
| 771 | facebook::thrift::protocol::TProtocol*)> |
| 772 | processMap_; |
| 773 | \end{verbatim} |
| 774 | |
| 775 | Using these techniques, the cost of string processing is minimized, and we |
| 776 | reap the benefit of being able to easily debug corrupt or misunderstood data by |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 777 | inspecting it for known string method names. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 778 | |
| 779 | \subsection{Servers and Multithreading} |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 780 | Thrift services require basic multithreading to handle simultaneous |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 781 | requests from multiple clients. For the Python and Java implementations of |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 782 | Thrift server logic, the standard threading libraries distributed with the |
| 783 | languages provide adequate support. For the C++ implementation, no standard multithread runtime |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 784 | library exists. Specifically, robust, lightweight, and portable |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 785 | thread manager and timer class implementations do not exist. We investigated |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 786 | existing implementations, namely \texttt{boost::thread}, |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 787 | \texttt{boost::threadpool}, \texttt{ACE\_Thread\_Manager} and |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 788 | \texttt{ACE\_Timer}. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 789 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 790 | While \texttt{boost::threads}\cite{boost.threads} provides clean, |
| 791 | lightweight and robust implementations of multi-thread primitives (mutexes, |
| 792 | conditions, threads) it does not provide a thread manager or timer |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 793 | implementation. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 794 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 795 | \texttt{boost::threadpool}\cite{boost.threadpool} also looked promising but |
| 796 | was not far enough along for our purposes. We wanted to limit the dependency on |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 797 | third-party libraries as much as possible. Because\\ |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 798 | \texttt{boost::threadpool} is |
| 799 | not a pure template library and requires runtime libraries and because it is |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 800 | not yet part of the official Boost distribution we felt it was not ready for |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 801 | use in Thrift. As \texttt{boost::threadpool} evolves and especially if it is |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 802 | added to the Boost distribution we may reconsider our decision to not use it. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 803 | |
| 804 | ACE has both a thread manager and timer class in addition to multi-thread |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 805 | primitives. The biggest problem with ACE is that it is ACE. Unlike Boost, ACE |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 806 | API quality is poor. Everything in ACE has large numbers of dependencies on |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 807 | everything else in ACE - thus forcing developers to throw out standard |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 808 | classes, such as STL collections, in favor of ACE's homebrewed implementations. In |
| 809 | addition, unlike Boost, ACE implementations demonstrate little understanding |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 810 | of the power and pitfalls of C++ programming and take no advantage of modern |
| 811 | templating techniques to ensure compile time safety and reasonable compiler |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 812 | error messages. For all these reasons, ACE was rejected. Instead, we chose |
| 813 | to implement our own library, described in the following sections. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 814 | |
| 815 | \subsection{Thread Primitives} |
| 816 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 817 | The Thrift thread libraries are implemented in the namespace\\ |
| 818 | \texttt{facebook::thrift::concurrency} and have three components: |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 819 | \begin{itemize} |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 820 | \item primitives |
| 821 | \item thread pool manager |
| 822 | \item timer manager |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 823 | \end{itemize} |
| 824 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 825 | As mentioned above, we were hesitant to introduce any additional dependencies |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 826 | on Thrift. We decided to use \texttt{boost::shared\_ptr} because it is so |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 827 | useful for multithreaded application, it requires no link-time or |
| 828 | runtime libraries (i.e. it is a pure template library) and it is due |
| 829 | to become part of the C++0x standard. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 830 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 831 | We implement standard \texttt{Mutex} and \texttt{Condition} classes, and a |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 832 | \texttt{Monitor} class. The latter is simply a combination of a mutex and |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 833 | condition variable and is analogous to the \texttt{Monitor} implementation provided for |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 834 | the Java \texttt{Object} class. This is also sometimes referred to as a barrier. We |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 835 | provide a \texttt{Synchronized} guard class to allow Java-like synchronized blocks. |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 836 | This is just a bit of syntactic sugar, but, like its Java counterpart, clearly |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 837 | delimits critical sections of code. Unlike its Java counterpart, we still |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 838 | have the ability to programmatically lock, unlock, block, and signal monitors. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 839 | |
| 840 | \begin{verbatim} |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 841 | void run() { |
| 842 | {Synchronized s(manager->monitor); |
| 843 | if (manager->state == TimerManager::STARTING) { |
| 844 | manager->state = TimerManager::STARTED; |
| 845 | manager->monitor.notifyAll(); |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 846 | } |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 847 | } |
| 848 | } |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 849 | \end{verbatim} |
| 850 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 851 | We again borrowed from Java the distinction between a thread and a runnable |
| 852 | class. A \texttt{Thread} is the actual schedulable object. The |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 853 | \texttt{Runnable} is the logic to execute within the thread. |
| 854 | The \texttt{Thread} implementation deals with all the platform-specific thread |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 855 | creation and destruction issues, while the \texttt{Runnable} implementation deals |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 856 | with the application-specific per-thread logic. The benefit of this approach |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 857 | is that developers can easily subclass the Runnable class without pulling in |
Aditya Agarwal | 8c5daa4 | 2007-04-05 01:17:52 +0000 | [diff] [blame] | 858 | platform-specific super-classes. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 859 | |
| 860 | \subsection{Thread, Runnable, and shared\_ptr} |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 861 | We use \texttt{boost::shared\_ptr} throughout the \texttt{ThreadManager} and |
| 862 | \texttt{TimerManager} implementations to guarantee cleanup of dead objects that can |
| 863 | be accessed by multiple threads. For \texttt{Thread} class implementations, |
| 864 | \texttt{boost::shared\_ptr} usage requires particular attention to make sure |
| 865 | \texttt{Thread} objects are neither leaked nor dereferenced prematurely while |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 866 | creating and shutting down threads. |
| 867 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 868 | Thread creation requires calling into a C library. (In our case the POSIX |
Mark Slee | 09bfd61 | 2007-04-04 19:56:41 +0000 | [diff] [blame] | 869 | thread library, \texttt{libpthread}, but the same would be true for WIN32 threads). |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 870 | Typically, the OS makes few, if any, guarantees about when \texttt{ThreadMain}, a C thread's entry-point function, will be called. Therefore, it is |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 871 | possible that our thread create call, |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 872 | \texttt{ThreadFactory::newThread()} could return to the caller |
| 873 | well before that time. To ensure that the returned \texttt{Thread} object is not |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 874 | prematurely cleaned up if the caller gives up its reference prior to the |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 875 | \texttt{ThreadMain} call, the \texttt{Thread} object makes a weak referenence to |
| 876 | itself in its \texttt{start} method. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 877 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 878 | With the weak reference in hand the \texttt{ThreadMain} function can attempt to get |
| 879 | a strong reference before entering the \texttt{Runnable::run} method of the |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 880 | \texttt{Runnable} object bound to the \texttt{Thread}. If no strong references to the |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 881 | thread are obtained between exiting \texttt{Thread::start} and entering \texttt{ThreadMain}, the weak reference returns \texttt{null} and the function |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 882 | exits immediately. |
| 883 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 884 | The need for the \texttt{Thread} to make a weak reference to itself has a |
| 885 | significant impact on the API. Since references are managed through the |
| 886 | \texttt{boost::shared\_ptr} templates, the \texttt{Thread} object must have a reference |
| 887 | to itself wrapped by the same \texttt{boost::shared\_ptr} envelope that is returned |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 888 | to the caller. This necessitated the use of the factory pattern. |
| 889 | \texttt{ThreadFactory} creates the raw \texttt{Thread} object and a |
| 890 | \texttt{boost::shared\_ptr} wrapper, and calls a private helper method of the class |
| 891 | implementing the \texttt{Thread} interface (in this case, \texttt{PosixThread::weakRef}) |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 892 | to allow it to make add weak reference to itself through the |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 893 | \texttt{boost::shared\_ptr} envelope. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 894 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 895 | \texttt{Thread} and \texttt{Runnable} objects reference each other. A \texttt{Runnable} |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 896 | object may need to know about the thread in which it is executing, and a Thread, obviously, |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 897 | needs to know what \texttt{Runnable} object it is hosting. This interdependency is |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 898 | further complicated because the lifecycle of each object is independent of the |
Mark Slee | 8142b9d | 2007-04-03 05:49:12 +0000 | [diff] [blame] | 899 | other. An application may create a set of \texttt{Runnable} object to be reused in different threads, or it may create and forget a \texttt{Runnable} object |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 900 | once a thread has been created and started for it. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 901 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 902 | The \texttt{Thread} class takes a \texttt{boost::shared\_ptr} reference to the hosted |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 903 | \texttt{Runnable} object in its constructor, while the \texttt{Runnable} class has an |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 904 | explicit \texttt{thread} method to allow explicit binding of the hosted thread. |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 905 | \texttt{ThreadFactory::newThread} binds the objects to each other. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 906 | |
| 907 | \subsection{ThreadManager} |
| 908 | |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 909 | \texttt{ThreadManager} creates a pool of worker threads and |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 910 | allows applications to schedule tasks for execution as free worker threads |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 911 | become available. The \texttt{ThreadManager} does not implement dynamic |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 912 | thread pool resizing, but provides primitives so that applications can add |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 913 | and remove threads based on load. This approach was chosen because |
| 914 | implementing load metrics and thread pool size is very application |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 915 | specific. For example some applications may want to adjust pool size based |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 916 | on running-average of work arrival rates that are measured via polled |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 917 | samples. Others may simply wish to react immediately to work-queue |
| 918 | depth high and low water marks. Rather than trying to create a complex |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 919 | API abstract enough to capture these different approaches, we |
| 920 | simply leave it up to the particular application and provide the |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 921 | primitives to enact the desired policy and sample current status. |
| 922 | |
| 923 | \subsection{TimerManager} |
| 924 | |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 925 | \texttt{TimerManager} allows applications to schedule |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 926 | \texttt{Runnable} objects for execution at some point in the future. Its specific task |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 927 | is to allows applications to sample \texttt{ThreadManager} load at regular |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 928 | intervals and make changes to the thread pool size based on application policy. |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 929 | Of course, it can be used to generate any number of timer or alarm events. |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 930 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 931 | The default implementation of \texttt{TimerManager} uses a single thread to |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 932 | execute expired \texttt{Runnable} objects. Thus, if a timer operation needs to |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 933 | do a large amount of work and especially if it needs to do blocking I/O, |
| 934 | that should be done in a separate thread. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 935 | |
| 936 | \subsection{Nonblocking Operation} |
| 937 | Though the Thrift transport interfaces map more directly to a blocking I/O |
| 938 | model, we have implemented a high performance \texttt{TNonBlockingServer} |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 939 | in C++ based on \texttt{libevent} and the \texttt{TFramedTransport}. We |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 940 | implemented this by moving all I/O into one tight event loop using a |
| 941 | state machine. Essentially, the event loop reads framed requests into |
| 942 | \texttt{TMemoryBuffer} objects. Once entire requests are ready, they are |
| 943 | dispatched to the \texttt{TProcessor} object which can read directly from |
| 944 | the data in memory. |
| 945 | |
| 946 | \subsection{Compiler} |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 947 | The Thrift compiler is implemented in C++ using standard \texttt{lex}/\texttt{yacc} |
| 948 | lexing and parsing. Though it could have been implemented with fewer |
| 949 | lines of code in another language (i.e. Python Lex-Yacc (PLY) or \texttt{ocamlyacc}), using C++ |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 950 | forces explicit definition of the language constructs. Strongly typing the |
| 951 | parse tree elements (debatably) makes the code more approachable for new |
| 952 | developers. |
| 953 | |
| 954 | Code generation is done using two passes. The first pass looks only for |
| 955 | include files and type definitions. Type definitions are not checked during |
| 956 | this phase, since they may depend upon include files. All included files |
| 957 | are sequentially scanned in a first pass. Once the include tree has been |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 958 | resolved, a second pass over all files is taken that inserts type definitions |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 959 | into the parse tree and raises an error on any undefined types. The program is |
| 960 | then generated against the parse tree. |
| 961 | |
| 962 | Due to inherent complexities and potential for circular dependencies, |
| 963 | we explicitly disallow forward declaration. Two Thrift structs cannot |
| 964 | each contain an instance of the other. (Since we do not allow \texttt{null} |
| 965 | struct instances in the generated C++ code, this would actually be impossible.) |
| 966 | |
Aditya Agarwal | af524ee | 2007-03-31 08:28:06 +0000 | [diff] [blame] | 967 | \subsection{TFileTransport} |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 968 | The \texttt{TFileTransport} logs Thrift requests/structs by |
| 969 | framing incoming data with its length and writing it out to disk. |
| 970 | Using a framed on-disk format allows for better error checking and |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 971 | helps with the processing of a finite number of discrete events. The\\ |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 972 | \texttt{TFileWriterTransport} uses a system of swapping in-memory buffers |
| 973 | to ensure good performance while logging large amounts of data. |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 974 | A Thrift log file is split up into chunks of a specified size; logged messages |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 975 | are not allowed to cross chunk boundaries. A message that would cross a chunk |
| 976 | boundary will cause padding to be added until the end of the chunk and the |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 977 | first byte of the message are aligned to the beginning of the next chunk. |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 978 | Partitioning the file into chunks makes it possible to read and interpret data |
| 979 | from a particular point in the file. |
Aditya Agarwal | af524ee | 2007-03-31 08:28:06 +0000 | [diff] [blame] | 980 | |
Mark Slee | c11045a | 2007-04-01 23:14:38 +0000 | [diff] [blame] | 981 | \section{Facebook Thrift Services} |
Aditya Agarwal | adf3e7f | 2007-03-31 16:56:14 +0000 | [diff] [blame] | 982 | Thrift has been employed in a large number of applications at Facebook, including |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 983 | search, logging, mobile, ads and the developer platform. Two specific usages are discussed below. |
Aditya Agarwal | adf3e7f | 2007-03-31 16:56:14 +0000 | [diff] [blame] | 984 | |
| 985 | \subsection{Search} |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 986 | Thrift is used as the underlying protocol and transport layer for the Facebook Search service. |
| 987 | The multi-language code generation is well suited for search because it allows for application |
Aditya Agarwal | adf3e7f | 2007-03-31 16:56:14 +0000 | [diff] [blame] | 988 | development in an efficient server side language (C++) and allows the Facebook PHP-based web application |
| 989 | to make calls to the search service using Thrift PHP libraries. There is also a large |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 990 | variety of search stats, deployment and testing functionality that is built on top |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 991 | of generated Python code. Additionally, the Thrift log file format is |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 992 | used as a redo log for providing real-time search index updates. Thrift has allowed the |
| 993 | search team to leverage each language for its strengths and to develop code at a rapid pace. |
Aditya Agarwal | adf3e7f | 2007-03-31 16:56:14 +0000 | [diff] [blame] | 994 | |
| 995 | \subsection{Logging} |
| 996 | The Thrift \texttt{TFileTransport} functionality is used for structured logging. Each |
| 997 | service function definition along with its parameters can be considered to be |
David Reiss | 0c90f6f | 2008-02-06 22:18:40 +0000 | [diff] [blame] | 998 | a structured log entry identified by the function name. This log can then be used for |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 999 | a variety of purposes, including inline and offline processing, stats aggregation and as a redo log. |
Aditya Agarwal | adf3e7f | 2007-03-31 16:56:14 +0000 | [diff] [blame] | 1000 | |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1001 | \section{Conclusions} |
| 1002 | Thrift has enabled Facebook to build scalable backend |
| 1003 | services efficiently by enabling engineers to divide and conquer. Application |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 1004 | developers can focus on application code without worrying about the |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1005 | sockets layer. We avoid duplicated work by writing buffering and I/O logic |
| 1006 | in one place, rather than interspersing it in each application. |
| 1007 | |
| 1008 | Thrift has been employed in a wide variety of applications at Facebook, |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 1009 | including search, logging, mobile, ads, and the developer platform. We have |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1010 | found that the marginal performance cost incurred by an extra layer of |
Mark Slee | afaf276 | 2007-04-03 19:47:04 +0000 | [diff] [blame] | 1011 | software abstraction is far eclipsed by the gains in developer efficiency and |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1012 | systems reliability. |
| 1013 | |
| 1014 | \appendix |
| 1015 | |
| 1016 | \section{Similar Systems} |
| 1017 | The following are software systems similar to Thrift. Each is (very!) briefly |
| 1018 | described: |
| 1019 | |
| 1020 | \begin{itemize} |
| 1021 | \item \textit{SOAP.} XML-based. Designed for web services via HTTP, excessive |
| 1022 | XML parsing overhead. |
| 1023 | \item \textit{CORBA.} Relatively comprehensive, debatably overdesigned and |
| 1024 | heavyweight. Comparably cumbersome software installation. |
| 1025 | \item \textit{COM.} Embraced mainly in Windows client softare. Not an entirely |
| 1026 | open solution. |
| 1027 | \item \textit{Pillar.} Lightweight and high-performance, but missing versioning |
| 1028 | and abstraction. |
| 1029 | \item \textit{Protocol Buffers.} Closed-source, owned by Google. Described in |
| 1030 | Sawzall paper. |
| 1031 | \end{itemize} |
| 1032 | |
| 1033 | \acks |
| 1034 | |
| 1035 | Many thanks for feedback on Thrift (and extreme trial by fire) are due to |
Aditya Agarwal | af524ee | 2007-03-31 08:28:06 +0000 | [diff] [blame] | 1036 | Martin Smith, Karl Voskuil and Yishan Wong. |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1037 | |
| 1038 | Thrift is a successor to Pillar, a similar system developed |
| 1039 | by Adam D'Angelo, first while at Caltech and continued later at Facebook. |
| 1040 | Thrift simply would not have happened without Adam's insights. |
| 1041 | |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 1042 | \begin{thebibliography}{} |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1043 | |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 1044 | \bibitem{boost.threads} |
| 1045 | Kempf, William, |
| 1046 | ``Boost.Threads'', |
| 1047 | \url{http://www.boost.org/doc/html/threads.html} |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1048 | |
Marc Slemko | 10b3bdb | 2007-04-01 09:14:05 +0000 | [diff] [blame] | 1049 | \bibitem{boost.threadpool} |
| 1050 | Henkel, Philipp, |
| 1051 | ``threadpool'', |
| 1052 | \url{http://threadpool.sourceforge.net} |
| 1053 | |
| 1054 | \end{thebibliography} |
Mark Slee | 24b49d3 | 2007-03-21 01:24:00 +0000 | [diff] [blame] | 1055 | |
| 1056 | \end{document} |