Monday, September 17, 2012

Workaround for Copy Command from WebHDFS

At the moment, the WebHDFS API doesn't offer a Copy command. As a result, the client has to download the file to its local disk and re-upload it via the Create command. Since this means the data makes a round trip all the way to the client (typically a non-Java client), the following workaround can be set up to partly alleviate the problem.

Set up an HDFS WebDAV server on one of the DataNode (DN) or NameNode (NN) boxes. Issue the Copy command to the WebDAV server via a REST call. This frees up the client application while the WebDAV server, which has much better connectivity and proximity to HDFS, completes the Copy request.
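
As a rough illustration, the REST call could look like the Python sketch below. It sends the standard WebDAV COPY method with a Destination header. The host name, port, and the assumption that WebDAV paths map directly onto HDFS paths are all hypothetical here and depend on how the WebDAV server is deployed.

```python
import http.client
from urllib.parse import quote


def build_copy_request(webdav_host, webdav_port, src, dst, overwrite=False):
    """Build the method, path, and headers for a WebDAV COPY request.

    Assumption: the WebDAV server exposes HDFS paths directly under "/".
    """
    # WebDAV expects the copy target in the Destination header.
    destination = f"http://{webdav_host}:{webdav_port}{quote(dst)}"
    headers = {
        "Destination": destination,
        # "T" lets the server replace an existing target, "F" forbids it.
        "Overwrite": "T" if overwrite else "F",
    }
    return "COPY", quote(src), headers


def copy_via_webdav(webdav_host, webdav_port, src, dst, overwrite=False, timeout=30):
    """Ask the WebDAV server to copy src to dst and return the HTTP status."""
    method, path, headers = build_copy_request(
        webdav_host, webdav_port, src, dst, overwrite
    )
    conn = http.client.HTTPConnection(webdav_host, webdav_port, timeout=timeout)
    try:
        conn.request(method, path, headers=headers)
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status
    finally:
        conn.close()
```

A call such as `copy_via_webdav("webdav.example.com", 8080, "/data/a.txt", "/data/b.txt")` (placeholder host and port) would typically return 201 when a new file is created and 204 when an existing target is overwritten. No file data passes through the client, although the call itself still waits for the server's response, so a client that must not block may want to issue it from a background thread.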
