Project

Making Thrift Service stable for random data

Student Name

Rajeev Sampath

Email

rjvranga@gmail.com

TimeZone

GMT +5.30

Project Title

Making Thrift Service stable for random data

Abstract

Apache Thrift [1] is a framework for cross-language services development that makes it possible to build services which operate across several languages such as Java, C++ and Python. A JIRA bug report [2] states that the Thrift Java service can be crashed by sending it random data, which leaves it vulnerable to both accidental failures and Denial of Service attacks. This project aims to find a solution that makes the Thrift service stable when it receives random data.

Description

Apache Thrift is a framework for cross-language services development [1]. The Java implementation of the Thrift service currently cannot handle random incoming chunks of data properly, which causes the service to fail with out-of-memory errors. Since Thrift is meant to be used for RPC calls, being able to accept random data might appear to be out of scope. However, the vulnerabilities caused by this issue make it crucial to fix so that the Thrift service can run reliably in a production environment.
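
To illustrate the kind of failure described above, here is a minimal sketch that does not use any Thrift classes. It assumes a wire format in which a 4-byte big-endian length prefix precedes a payload; the class and method names (LengthPrefixDemo, decodeLength) are hypothetical and only serve the example. It shows how four arbitrary bytes can decode to a length close to 2 GB, which a reader that trusts the prefix would then try to allocate.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of Thrift.
class LengthPrefixDemo {

    // Interpret the first four bytes as a big-endian signed 32-bit length,
    // the way a length-prefixed protocol reader typically would.
    static int decodeLength(byte[] header) {
        return ByteBuffer.wrap(header, 0, 4).getInt();
    }

    public static void main(String[] args) {
        // Four bytes of "random" input that were never meant to be a length.
        byte[] garbage = {0x7f, 0x41, 0x42, 0x43};
        int claimed = decodeLength(garbage);
        System.out.println("claimed length = " + claimed); // prints 2134983235

        // A reader that trusts this value would now attempt
        //   new byte[claimed]
        // which asks the JVM for roughly 2 GB and can end in OutOfMemoryError.
    }
}
```

A prefix whose first byte has the high bit set decodes to a negative length instead, so a reader has to consider both cases.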

The solutions proposed so far include a Thrift-wide limit on the length of method names, advertising the length of framed transports, and some rather unusual approaches such as trying to catch out-of-memory errors. Though none of them fully solves the problem, they provide useful clues for this project.
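
The idea behind a size limit can be sketched as follows. This is not the fix proposed in the JIRA report and it does not use Thrift's transport or protocol classes; it only assumes a 4-byte big-endian length prefix, and the names BoundedFrameReader, readFrame and maxBytes are hypothetical. The point is that the claimed length is validated before any buffer is allocated.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper, not part of Thrift.
class BoundedFrameReader {

    // Read one length-prefixed frame, refusing lengths that are negative
    // or larger than maxBytes before allocating anything.
    static byte[] readFrame(DataInputStream in, int maxBytes) throws IOException {
        int length = in.readInt(); // 4 bytes, big-endian
        if (length < 0 || length > maxBytes) {
            throw new IOException("Rejected frame with claimed length " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload); // throws EOFException if the stream ends early
        return payload;
    }

    public static void main(String[] args) throws IOException {
        int limit = 1024; // illustrative limit, chosen arbitrarily

        // A well-formed frame: length 3 followed by three payload bytes.
        byte[] good = {0, 0, 0, 3, 1, 2, 3};
        byte[] payload = readFrame(new DataInputStream(new ByteArrayInputStream(good)), limit);
        System.out.println("accepted " + payload.length + " bytes");

        // Random data whose first four bytes decode to about 2 GB.
        byte[] garbage = {0x7f, 0x41, 0x42, 0x43};
        try {
            readFrame(new DataInputStream(new ByteArrayInputStream(garbage)), limit);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this turns a potential out-of-memory failure into an ordinary I/O error on a single connection. Choosing the limit, and deciding whether it should be configurable, is one of the trade-offs the project would need to evaluate.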

The goal of the project is to find a solution to this issue with the smallest possible performance trade-offs. This is likely to be an experiment-based project, with several iterations until an acceptable solution is found.

For more information related to this issue, please refer to the JIRA bug report [2].

Things done so far

So far I have read the Thrift whitepaper [3] to get an overall idea of the framework. I have explored the relevant areas of the source code pointed out in the JIRA bug report. In addition, I have gone through the solutions proposed on the JIRA page to understand the problem, though none of them fully makes the Thrift service stable for random data.

Project Timeline

Period

Activities

April 24 – May 26

Community bonding period. This period will mainly be used to become familiar with Thrift and its community.

May 27 – July 11

Problem solving starts. The problems in the current implementation need to be identified precisely. Then possible solutions need to be proposed, implemented, reviewed and tested. This is likely to be an experiment-based, iterative task. Some relevant parts of the message processing may need to be completely re-implemented to solve the problem. The solutions need to be tested for correctness and possibly for other aspects such as performance.

July 12 – July 16

Mid-term evaluation period. By this time, the goal is to have identified or implemented one solution that at least partially solves the problem.

July 17 – August 9

Continuation of the task. Depending on the progress, this period will be used for implementation and/or possible optimizations to the implementation. A final profiling and testing phase needs to be carried out to verify the performance and correctness of the solution.

August 10 – August 16

Documentation (and code refactoring if necessary).

August 16 – August 20

All the deliverables will be submitted during this period.

Deliverables

Community interaction

So far I have joined both the Thrift dev and user mailing lists, and I will use them to communicate with the community. Whenever possible, I also intend to help users with questions related to the areas I become familiar with. I can update the mentor(s) on my progress during the project period, preferably a couple of times per week or more frequently. Regular interaction with the community will be essential for this project, and I will always appreciate the community's feedback to make it successful.

About Me

I'm Rajeev Sampath, currently a final-year Computer Science undergraduate student at the University of Moratuwa, Sri Lanka. My interests are Distributed Computing and Algorithms.

I became a Sun Certified Java Programmer in 2006, and since then I have worked in many areas related to Java, including mobile applications development (Java ME), web apps (Servlets/JSPs), OSGi and Web Services (WS-* and REST). I have some coding experience in C++ and C# as well. Our final-year research project is to introduce a scalable and fault-tolerant architecture for distributed Complex Event Processing systems, through which I have gained a lot of experience in Distributed Systems and Complex Event Processing. One of the other major projects I've been involved in so far is a Stock Market dissemination service developed during my internship.

I'm passionate about algorithmic problem solving and take part in TopCoder and other online programming contests whenever I have free time.

Given my interest in exploring and hacking on Thrift and my experience in Distributed Computing, I believe I'll be able to complete this project successfully and continue contributing to Thrift afterwards.

References

[1] http://incubator.apache.org/thrift/

[2] https://issues.apache.org/jira/browse/THRIFT-601

[3] http://incubator.apache.org/thrift/static/thrift-20070401.pdf