Thrift: Whitespace cleanup.
Summary:
- Expanded tabs to spaces where spaces were the norm.
- Deleted almost all trailing whitespace.
- Added newlines to the ends of a few files.
- Ran dos2unix on one file or two.
Reviewed By: mcslee
Test Plan: git diff -b
Revert Plan: ok
git-svn-id: https://svn.apache.org/repos/asf/incubator/thrift/trunk@665467 13f79535-47bb-0310-9956-ffa450edef68
diff --git a/doc/thrift.tex b/doc/thrift.tex
index fc1e6ba..17766b5 100644
--- a/doc/thrift.tex
+++ b/doc/thrift.tex
@@ -20,9 +20,9 @@
\begin{document}
-% \conferenceinfo{WXYZ '05}{date, City.}
+% \conferenceinfo{WXYZ '05}{date, City.}
% \copyrightyear{2007}
-% \copyrightdata{[to be supplied]}
+% \copyrightdata{[to be supplied]}
% \titlebanner{banner above paper title} % These are ignored unless
% \preprintfooter{short description of paper} % 'preprint' option specified.
@@ -62,7 +62,7 @@
\section{Introduction}
As Facebook's traffic and network structure have scaled, the resource
-demands of many operations on the site (i.e. search,
+demands of many operations on the site (i.e. search,
ad selection and delivery, event logging) have presented technical requirements
drastically outside the scope of the LAMP framework. In our implementation of
these services, various programming languages have been selected to
@@ -102,7 +102,7 @@
a C++ programmer should be able to transparently exchange a strongly typed
STL map for a dynamic Python dictionary. Neither
programmer should be forced to write any code below the application layer
-to achieve this. Section 2 details the Thrift type system.
+to achieve this. Section 2 details the Thrift type system.
\textit{Transport.} Each language must have a common interface to
bidirectional raw data transport. The specifics of how a given
@@ -195,7 +195,7 @@
contain duplicates.
\item \texttt{set<type>} An unordered set of unique elements. Translates into
an STL \texttt{set}, Java \texttt{HashSet}, \texttt{set} in Python, or native
-dictionary in PHP/Ruby.
+dictionary in PHP/Ruby.
\item \texttt{map<type1,type2>} A map of strictly unique keys to values.
Translates into an STL \texttt{map}, Java \texttt{HashMap}, PHP associative
array, or Python/Ruby dictionary.
@@ -254,7 +254,7 @@
service StringCache {
void set(1:i32 key, 2:string value),
string get(1:i32 key) throws (1:KeyNotFound knf),
- void delete(1:i32 key)
+ void delete(1:i32 key)
}
\end{verbatim}
@@ -283,7 +283,7 @@
A key design choice in the implementation of Thrift was to decouple the
transport layer from the code generation layer. Though Thrift is typically
used on top of the TCP/IP stack with streaming sockets as the base layer of
-communication, there was no compelling reason to build that constraint into
+communication, there was no compelling reason to build that constraint into
the system. The performance tradeoff incurred by an abstracted I/O layer
(roughly one virtual method lookup / function call per operation) was
immaterial compared to the cost of actual I/O operations (typically invoking
@@ -509,7 +509,7 @@
service StringCache {
void set(1:i32 key, 2:string value),
string get(1:i32 key) throws (1:KeyNotFound knf),
- void delete(1:i32 key)
+ void delete(1:i32 key)
}
\end{verbatim}
@@ -540,7 +540,7 @@
number(10),
bigNumber(0),
decimals(0),
- name("thrifty") {}
+ name("thrifty") {}
int32_t number;
int64_t bigNumber;
@@ -560,7 +560,7 @@
} __isset;
...
}
-\end{verbatim}
+\end{verbatim}
\subsection{Case Analysis}
@@ -778,16 +778,16 @@
requests from multiple clients. For the Python and Java implementations of
Thrift server logic, the standard threading libraries distributed with the
languages provide adequate support. For the C++ implementation, no standard multithread runtime
-library exists. Specifically, robust, lightweight, and portable
+library exists. Specifically, robust, lightweight, and portable
thread manager and timer class implementations do not exist. We investigated
-existing implementations, namely \texttt{boost::thread},
+existing implementations, namely \texttt{boost::thread},
\texttt{boost::threadpool}, \texttt{ACE\_Thread\_Manager} and
-\texttt{ACE\_Timer}.
+\texttt{ACE\_Timer}.
While \texttt{boost::threads}\cite{boost.threads} provides clean,
lightweight and robust implementations of multi-thread primitives (mutexes,
conditions, threads) it does not provide a thread manager or timer
-implementation.
+implementation.
\texttt{boost::threadpool}\cite{boost.threadpool} also looked promising but
was not far enough along for our purposes. We wanted to limit the dependency on
@@ -801,7 +801,7 @@
ACE has both a thread manager and timer class in addition to multi-thread
primitives. The biggest problem with ACE is that it is ACE. Unlike Boost, ACE
API quality is poor. Everything in ACE has large numbers of dependencies on
-everything else in ACE - thus forcing developers to throw out standard
+everything else in ACE - thus forcing developers to throw out standard
classes, such as STL collections, in favor of ACE's homebrewed implementations. In
addition, unlike Boost, ACE implementations demonstrate little understanding
of the power and pitfalls of C++ programming and take no advantage of modern
@@ -820,17 +820,17 @@
\end{itemize}
As mentioned above, we were hesitant to introduce any additional dependencies
-on Thrift. We decided to use \texttt{boost::shared\_ptr} because it is so
+on Thrift. We decided to use \texttt{boost::shared\_ptr} because it is so
useful for multithreaded applications, it requires no link-time or
runtime libraries (i.e. it is a pure template library) and it is due
to become part of the C++0x standard.
We implement standard \texttt{Mutex} and \texttt{Condition} classes, and a
- \texttt{Monitor} class. The latter is simply a combination of a mutex and
+ \texttt{Monitor} class. The latter is simply a combination of a mutex and
condition variable and is analogous to the \texttt{Monitor} implementation provided for
-the Java \texttt{Object} class. This is also sometimes referred to as a barrier. We
+the Java \texttt{Object} class. This is also sometimes referred to as a barrier. We
provide a \texttt{Synchronized} guard class to allow Java-like synchronized blocks.
-This is just a bit of syntactic sugar, but, like its Java counterpart, clearly
+This is just a bit of syntactic sugar, but, like its Java counterpart, clearly
delimits critical sections of code. Unlike its Java counterpart, we still
have the ability to programmatically lock, unlock, block, and signal monitors.
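As an illustration of the idea described above (a mutex paired with a condition variable, plus a scoped guard), here is a minimal Python sketch. It is not the Thrift C++ implementation: the method names `lock`, `unlock`, `wait`, and `notify`, the `synchronized` helper, and the `demo` function are all assumptions made for this example.

```python
import threading
from contextlib import contextmanager


class Monitor:
    """A mutex combined with a condition variable (hypothetical sketch)."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())

    # Programmatic access remains available outside guarded blocks.
    def lock(self):
        self._cond.acquire()

    def unlock(self):
        self._cond.release()

    def wait(self, timeout=None):
        # Must be called while the monitor is locked.
        return self._cond.wait(timeout)

    def notify(self):
        # Must be called while the monitor is locked.
        self._cond.notify()


@contextmanager
def synchronized(monitor):
    """Guard that delimits a critical section, like a Java synchronized block."""
    monitor.lock()
    try:
        yield monitor
    finally:
        monitor.unlock()


def demo():
    """One thread signals the monitor; the caller blocks until signalled."""
    monitor = Monitor()
    items = []

    def producer():
        with synchronized(monitor):
            items.append("ready")
            monitor.notify()

    with synchronized(monitor):
        worker = threading.Thread(target=producer)
        worker.start()
        while not items:  # loop guards against spurious wakeups
            monitor.wait(timeout=1.0)
    worker.join()
    return items
```

The guard only adds scoping. Code that needs finer control can still call `lock`, `unlock`, `wait`, and `notify` directly.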
@@ -847,11 +847,11 @@
We again borrowed from Java the distinction between a thread and a runnable
class. A \texttt{Thread} is the actual schedulable object. The
-\texttt{Runnable} is the logic to execute within the thread.
-The \texttt{Thread} implementation deals with all the platform-specific thread
+\texttt{Runnable} is the logic to execute within the thread.
+The \texttt{Thread} implementation deals with all the platform-specific thread
creation and destruction issues, while the \texttt{Runnable} implementation deals
with the application-specific per-thread logic. The benefit of this approach
-is that developers can easily subclass the Runnable class without pulling in
+is that developers can easily subclass the Runnable class without pulling in
platform-specific super-classes.
\subsection{Thread, Runnable, and shared\_ptr}
@@ -875,7 +875,7 @@
With the weak reference in hand the \texttt{ThreadMain} function can attempt to get
a strong reference before entering the \texttt{Runnable::run} method of the
\texttt{Runnable} object bound to the \texttt{Thread}. If no strong references to the
-thread are obtained between exiting \texttt{Thread::start} and entering \texttt{ThreadMain}, the weak reference returns \texttt{null} and the function
+thread are obtained between exiting \texttt{Thread::start} and entering \texttt{ThreadMain}, the weak reference returns \texttt{null} and the function
exits immediately.
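The weak-to-strong promotion described above can be sketched in Python with the standard `weakref` module. This is an analogy rather than the C++ code: the class layout and the names `thread_main` and `demo` are hypothetical, and the sketch relies on the referent being reclaimed once the last strong reference is dropped.

```python
import gc
import weakref


class Runnable:
    """Application logic to execute within a thread (hypothetical)."""

    def __init__(self):
        self.ran = False

    def run(self):
        self.ran = True


class Thread:
    """Schedulable object that hosts a Runnable (hypothetical)."""

    def __init__(self, runnable):
        self.runnable = runnable


def thread_main(weak_thread):
    """Try to promote the weak reference before running the Runnable."""
    thread = weak_thread()  # strong reference, or None if already destroyed
    if thread is None:
        return False  # nothing left to run: exit immediately
    thread.runnable.run()
    return True


def demo():
    runnable = Runnable()
    thread = Thread(runnable)
    weak = weakref.ref(thread)
    ran_while_alive = thread_main(weak)
    del thread  # drop the last strong reference
    gc.collect()  # make reclamation explicit for the example
    ran_after_release = thread_main(weak)
    return ran_while_alive, ran_after_release
```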
The need for the \texttt{Thread} to make a weak reference to itself has a
@@ -894,7 +894,7 @@
needs to know what \texttt{Runnable} object it is hosting. This interdependency is
further complicated because the lifecycle of each object is independent of the
other. An application may create a set of \texttt{Runnable} objects to be reused in different threads, or it may create and forget a \texttt{Runnable} object
-once a thread has been created and started for it.
+once a thread has been created and started for it.
The \texttt{Thread} class takes a \texttt{boost::shared\_ptr} reference to the hosted
\texttt{Runnable} object in its constructor, while the \texttt{Runnable} class has an
@@ -903,30 +903,30 @@
\subsection{ThreadManager}
-\texttt{ThreadManager} creates a pool of worker threads and
+\texttt{ThreadManager} creates a pool of worker threads and
allows applications to schedule tasks for execution as free worker threads
-become available. The \texttt{ThreadManager} does not implement dynamic
+become available. The \texttt{ThreadManager} does not implement dynamic
thread pool resizing, but provides primitives so that applications can add
-and remove threads based on load. This approach was chosen because
-implementing load metrics and thread pool size is very application
+and remove threads based on load. This approach was chosen because
+implementing load metrics and thread pool size is very application
specific. For example, some applications may want to adjust pool size based
on a running average of work arrival rates that are measured via polled
samples. Others may simply wish to react immediately to work-queue
depth high and low water marks. Rather than trying to create a complex
-API abstract enough to capture these different approaches, we
-simply leave it up to the particular application and provide the
+API abstract enough to capture these different approaches, we
+simply leave it up to the particular application and provide the
primitives to enact the desired policy and sample current status.
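To make the water-mark policy mentioned above concrete, here is a small Python sketch of the decision an application might compute from sampled status. The function name, parameters, and default limits are assumptions for this example and are not part of the \texttt{ThreadManager} API.

```python
def pool_adjustment(queue_depth, workers, low_mark, high_mark,
                    min_workers=1, max_workers=64):
    """Return +1 to add a worker, -1 to remove one, or 0 to leave the pool alone.

    Hypothetical policy: grow when the work queue is deeper than the high
    water mark, shrink when it is shallower than the low water mark.
    """
    if low_mark > high_mark:
        raise ValueError("low_mark must not exceed high_mark")
    if queue_depth > high_mark and workers < max_workers:
        return 1
    if queue_depth < low_mark and workers > min_workers:
        return -1
    return 0
```

An application would call a function like this from its own sampling loop and then use the add and remove primitives to apply the result. A different application could substitute a running-average policy without changing the pool itself.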
\subsection{TimerManager}
\texttt{TimerManager} allows applications to schedule
- \texttt{Runnable} objects for execution at some point in the future. Its specific task
+ \texttt{Runnable} objects for execution at some point in the future. Its specific task
is to allow applications to sample \texttt{ThreadManager} load at regular
intervals and make changes to the thread pool size based on application policy.
Of course, it can be used to generate any number of timer or alarm events.
The default implementation of \texttt{TimerManager} uses a single thread to
-execute expired \texttt{Runnable} objects. Thus, if a timer operation needs to
+execute expired \texttt{Runnable} objects. Thus, if a timer operation needs to
do a large amount of work and especially if it needs to do blocking I/O,
that should be done in a separate thread.
@@ -962,18 +962,18 @@
struct instances in the generated C++ code, this would actually be impossible.)
\subsection{TFileTransport}
-The \texttt{TFileTransport} logs Thrift requests/structs by
-framing incoming data with its length and writing it out to disk.
-Using a framed on-disk format allows for better error checking and
+The \texttt{TFileTransport} logs Thrift requests/structs by
+framing incoming data with its length and writing it out to disk.
+Using a framed on-disk format allows for better error checking and
helps with the processing of a finite number of discrete events. The\\
-\texttt{TFileWriterTransport} uses a system of swapping in-memory buffers
-to ensure good performance while logging large amounts of data.
+\texttt{TFileWriterTransport} uses a system of swapping in-memory buffers
+to ensure good performance while logging large amounts of data.
A Thrift log file is split up into chunks of a specified size; logged messages
-are not allowed to cross chunk boundaries. A message that would cross a chunk
-boundary will cause padding to be added until the end of the chunk and the
+are not allowed to cross chunk boundaries. A message that would cross a chunk
+boundary will cause padding to be added until the end of the chunk and the
first byte of the message are aligned to the beginning of the next chunk.
-Partitioning the file into chunks makes it possible to read and interpret data
-from a particular point in the file.
+Partitioning the file into chunks makes it possible to read and interpret data
+from a particular point in the file.
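The chunk-boundary rule can be worked through with a little arithmetic. The Python sketch below assumes that a framed message (length prefix plus payload) occupies `frame_len` bytes and that chunks have a fixed `chunk_size`. The function names are hypothetical and do not come from the Thrift library.

```python
def padding_before(offset, frame_len, chunk_size):
    """Bytes of padding to write at `offset` before a frame of `frame_len` bytes.

    A frame that fits in the rest of the current chunk needs no padding.
    Otherwise the rest of the chunk is padded so that the frame starts at
    the beginning of the next chunk.
    """
    if frame_len > chunk_size:
        raise ValueError("frame does not fit in a single chunk")
    remaining = chunk_size - (offset % chunk_size)
    return 0 if frame_len <= remaining else remaining


def chunk_start(offset, chunk_size):
    """Start of the chunk containing `offset`: a safe place to begin reading."""
    return offset - (offset % chunk_size)
```

For example, with 16-byte chunks, a 7-byte frame written at offset 10 would cross the boundary at 16, so 6 bytes of padding are added and the frame begins at offset 16.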
\section{Facebook Thrift Services}
Thrift has been employed in a large number of applications at Facebook, including
@@ -984,15 +984,15 @@
The multi-language code generation is well suited for search because it allows for application
development in an efficient server side language (C++) and allows the Facebook PHP-based web application
to make calls to the search service using Thrift PHP libraries. There is also a large
-variety of search stats, deployment and testing functionality that is built on top
+variety of search stats, deployment and testing functionality that is built on top
of generated Python code. Additionally, the Thrift log file format is
-used as a redo log for providing real-time search index updates. Thrift has allowed the
-search team to leverage each language for its strengths and to develop code at a rapid pace.
+used as a redo log for providing real-time search index updates. Thrift has allowed the
+search team to leverage each language for its strengths and to develop code at a rapid pace.
\subsection{Logging}
The Thrift \texttt{TFileTransport} functionality is used for structured logging. Each
service function definition along with its parameters can be considered to be
-a structured log entry identified by the function name. This log can then be used for
+a structured log entry identified by the function name. This log can then be used for
a variety of purposes, including inline and offline processing, stats aggregation and as a redo log.
\section{Conclusions}