HTTP Raw Body

The request is empty!

This is by far the most frequent misunderstanding with PubSubHubbub. Subscribers can see the HTTP POST request in their logs, but for some reason they’re unable to access its content. The reason is that most web frameworks and languages assume that POST requests are sent by forms, and usually expose only the parsed version of the raw body. If they can’t parse it (because the Content-Type header does not match a form type), they show it as empty.
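
The effect is easy to reproduce outside any framework. The sketch below, in Python and using only the standard library, mimics what a form-oriented framework does: it parses the body only when the Content-Type is a form type, and otherwise hands back an empty dictionary. The function name `parsed_form` and the sample payloads are made up for illustration; real frameworks differ in the details.

```python
from urllib.parse import parse_qs

FORM_TYPES = ("application/x-www-form-urlencoded",)

def parsed_form(content_type: str, raw_body: bytes) -> dict:
    """Mimic a form-oriented framework: parse only form-encoded bodies."""
    # Strip parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in FORM_TYPES:
        return {}  # the raw body is ignored, so the request looks empty
    return parse_qs(raw_body.decode("utf-8"))

# A form post is parsed as expected...
form = parsed_form("application/x-www-form-urlencoded", b"hub.mode=subscribe")
# ...but an Atom notification yields nothing, although the bytes are there.
atom_body = b"<feed xmlns='http://www.w3.org/2005/Atom'></feed>"
atom = parsed_form("application/atom+xml", atom_body)
print(form, atom, len(atom_body))
```

The bytes of the notification never went anywhere; only the parsed view is empty, which is why the sections below all look for the raw body instead.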

For each language and framework, we list the ways to access the raw body of an HTTP POST request.

PHP

PHP exposes $_POST, but it stays hopelessly empty when the data is not multipart/form-data or application/x-www-form-urlencoded. The PHP docs are pretty clear: you need to use php://input, which is a read-only stream.

$entityBody = file_get_contents('php://input');

However, the trick is that in PHP versions before 5.6 this stream can only be read once, so if you read it, make sure you copy the data somewhere so you can access it again. Unfortunately, frameworks are probably already reading from that stream.

Symfony

Symfony provides Request objects to access the internals of the HTTP requests. These objects have a getContent method which you can use to access the string representation of the raw body. Laravel, Drupal, eZPublish and all PHP frameworks based on Symfony use a similar mechanism.

// Laravel example.
$request = Request::instance(); // Access the instance
$request->getContent(); // Get its content

CakePHP

Cake uses another approach and lets you define a callback which will be called to handle the data from the request:

// For JSON bodies, you'll want to use the json_decode function:
$json = $this->request->input('json_decode');
// For XML/Atom, you might use the builder:
$xml = $data = $this->request->input('Xml::build', ...);

Node.js

Node.js is one of those platforms that does not try to parse the raw body of the POST request. By default, Node does not even read the body of POST requests.

Here’s an example of a very basic echo server. The only trick is that the body may arrive in several chunks, which means we need to append each chunk we get to a buffer until the request ends.

var http = require('http');
var s = http.createServer(function (req, res) {
  var raw = '';
  req.on('data', function(d) {
    raw += d; 
  });
  req.on('end', function() {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end(raw);   
  });
});
s.listen(9999);

Express & Connect

Express (or anything built on Connect) provides a lot of syntactic sugar on top of Node’s default HTTP APIs. However, you can always “revert” to the Node.js way which means you can use the technique above to get the raw body. Another popular option is to offload this to a middleware by using a pipe.

Here’s an example which writes to a concat stream. We assign the full body to the request for handling further down the middleware chain.

var concat = require('concat-stream');
app.use(function(request, response, next){
  request.pipe(concat(function(data){
    request.body = data; 
    next();
  }));
});

Hapi

Hapi does not use middleware but can be configured to handle POST requests differently. You should use one of the following values for the payload configuration:

  • parse is the default. Hapi will assign both rawBody and payload to your request objects, holding respectively a raw buffer of the POST body and its parsed value.
  • stream leaves the POST body untouched. You can access Node’s Request object using raw.req on the Request object.
  • raw will just assign rawBody to the request object with the content of the body.

Ruby

Ruby itself does not provide an HTTP parsing library to handle requests outside of a web framework.

Rack

Rack is the common denominator between Ramaze, Sinatra and many other micro frameworks. It provides some helpers which can be convenient when handling HTTP requests, in the form of Rack::Request. If you’re looking for the raw HTTP body, check rack.input.

It’s an IO stream which can be read to access the content of the request. It’s exposed on the request object as body, or directly as @env["rack.input"].

Here’s a Sinatra example:

post "/path" do
  request.body.rewind  # back to the head, if needed
  data = request.body.read
  "#{data}" # echo server!
end

Rails

Ruby on Rails controllers have a request method to access the HTTP request object. This object has a raw_post method to get the raw body of any request.

Our Rails Engine uses this exact technique to access the raw body and compute the signature.
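
Signature checks are the main reason the raw body matters: the HMAC must be computed over the exact bytes received, not over a re-serialized version of the parsed data. Here is a minimal sketch of that check, written in Python rather than Ruby and not taken from the Rails Engine itself. It assumes the hub sends an `X-Hub-Signature` header of the form `sha1=<hex digest>`, as described in PubSubHubbub 0.4; the function name, secret and payload are placeholders.

```python
import hashlib
import hmac

def signature_is_valid(raw_body: bytes, header_value: str, secret: bytes) -> bool:
    """Check an X-Hub-Signature style header against the raw body bytes."""
    algorithm, _, received = header_value.partition("=")
    if algorithm != "sha1" or not received:
        return False
    expected = hmac.new(secret, raw_body, hashlib.sha1).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, received)

# Example values (placeholders, not real credentials).
secret = b"my-hub-secret"
body = b'{"status": "ok"}'
header = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()

print(signature_is_valid(body, header, secret))         # untouched body
print(signature_is_valid(body + b" ", header, secret))  # body altered by one byte
```

Even a single extra space changes the digest, so a body that has been parsed and serialized again will generally fail the check.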

Python

Similar to Ruby, Python web applications usually handle HTTP requests through a web framework rather than through the language itself.

Django

Django is arguably the most popular Python web framework. For each request received, Django creates an HttpRequest object that contains metadata about the request. Its body property contains the raw request body as bytes.

Flask

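Flask is a microframework for Python. It’s compliant with WSGI (the Web Server Gateway Interface), Python’s standard interface between web servers and applications. The request object’s get_data method returns the raw body.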

from flask import Flask, request
app = Flask(__name__)
@app.route('/', methods=['POST'])
def parse_request():
    data = request.get_data()  # raw body, as bytes
    return data  # echo it back
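
Because Flask sits on top of WSGI, the same bytes are also reachable without any framework, through the `wsgi.input` stream of the WSGI environ. The following is a sketch using only the standard library; the helper name `read_raw_body` and the echo application are illustrative, not part of any framework.

```python
import io

def read_raw_body(environ: dict) -> bytes:
    """Read the raw request body from a WSGI environ."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    # Read exactly Content-Length bytes; wsgi.input may not signal EOF.
    return environ["wsgi.input"].read(length) if length > 0 else b""

def application(environ, start_response):
    """Minimal WSGI echo application."""
    body = read_raw_body(environ)
    start_response("200 OK", [("Content-Type", "application/octet-stream"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Exercise the application with a hand-built environ (no server needed).
payload = b"<feed/>"
environ = {"REQUEST_METHOD": "POST",
           "CONTENT_LENGTH": str(len(payload)),
           "wsgi.input": io.BytesIO(payload)}
statuses = []
result = application(environ, lambda status, headers: statuses.append(status))
print(statuses, result)
```

To try it against real requests, the application can be served with `wsgiref.simple_server.make_server` from the standard library.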

AppEngine

AppEngine is a popular platform for running Python applications. The framework and runtime provided by Google define a RequestHandler class, which is instantiated for each request. Accessing the raw body is then trivial, as it’s a property of the request object.

Here’s an example:

class myHandler(webapp.RequestHandler):
    def __init__(self):
        ...
    def post(self):
        return self.request.body

C# (ASP.NET)

Using the Request.InputStream property, data can be read in a raw fashion as binary, or, using a System.IO.StreamReader, it can be read as text. This can be done multiple times.

string value;
using (System.IO.StreamReader SR = new System.IO.StreamReader(Request.InputStream))
{
    value=SR.ReadToEnd();
}

Erlang

Cowboy

Accessing the request body is a snap in Cowboy, a small, fast, modular HTTP server for Erlang applications. Use this call in your handler function:

{ok, Body, Req2} = cowboy_req:body(Req1).

Haskell

For web applications in the Haskell programming language there exists a shared Web Application Interface (WAI), quite similar to what Rack is to Ruby.

Yesod

The Yesod web framework provides access to the Wai request in handlers. This lets you read the body into a lazy ByteString, so that you can start consuming the data while it’s being received.

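import Yesod
import qualified Network.Wai as Wai

-- |Handler function
postEndpointR = do
  req <- waiRequest                         -- the underlying WAI request
  body <- liftIO (Wai.lazyRequestBody req)  -- raw body as a lazy ByteString
  -- ... process body and return a response here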

There’s more!

We’re missing several! Go, Scala, Java, Perl… etc! Please, help us by leaving details in the comments or by sending a pull request.

Thanks a lot to astro and AyrA for their contributions!
