Dr. MonkeyIQ: July 2013

Monday, July 29, 2013

GDrive mounting released!

So version libferris-1.5.18.tar.xz is hot off the make dist, including the much ado about mounting Google Drive support. The last feature I decided to add before rolling the tarball was support for viewing and adding to the sharing information of a file. Being able to "cp" a file to google://drive didn't really do much for me without also being able to unlock it for the people I want to have access to it. So now you can do that from the filesystem as well.

So, since the previous posts have been about the GDrive API and various snags I ran into along the way, this post is about how you can actually use this stuff.

Firstly run up the ferris-capplet-auth app and select the GDrive tab. I know I should overhaul the UI for this auth tool, but since it's mostly only used once per web service I haven't found the personal desire to beautify it. Inside the GDrive tab, clicking on the "Authenticate with GDrive" button opens a dialog (which should become a wizard). The first thing to do, as it tells you, is visit the console page on google to enable the GDrive API. Then click or paste the auth link in the dialog to allow libferris to get its hands on your data. The auth link goes to google and tells you what libferris is wanting. When you OK that you are given a "code" that you have to copy and paste back into the lower part of the auth capplet's dialog window. Then OKing the dialog will have libferris get a proper auth token from google and you are all set.

So to get started the below command will list the contents of your GDrive:

$ ferrisls google://drive

To put a file up on there you can do something like:

$ date >/tmp/sample.txt
$ ferriscp /tmp/sample.txt google://drive

And you can get it back with cat if you like. Or ferriscp it somewhere else etc.

$ fcat google://drive/sample.txt
Mon Jul 29 17:21:28 EST 2013

If you want to see your shares for this new sample file use the "shares" extended attribute.

$ fcat -a shares google://drive/sample.txt
write,monkeyiq

The shares attribute is a BINEBO (Bytes In Not Equal Bytes Out). Yay for me coining new terms! This means that what you write to it is not exactly what you will get when you read back from it. The handy part of that is that if you write an email address into the extended attribute, you are adding that person to the list of folks who can write to the file. Because I'm using libferris without FUSE and bash doesn't understand libferris URLs, I have to use ferris-redirect in the below command. You can think of ferris-redirect like the shell redirection (>) but you can also supply the extended attribute to redirect data into with (-a). If I read back the shares extended attribute I'll see a new entry in there. Google will have sent a notification email to my friend with a link to the file for me also.

$ echo niceguy@example.com \
| ferris-redirect -a shares google://drive/sample.txt
$ fcat -a shares google://drive/sample.txt
write,monkeyiq
write,Really Nice Guy

I could also hook this up to your "contacts", so your evolution addressbook nicknames or google contacts could be used to look up a person. In this case, with names changed to protect the innocent etc, google hypothetically thinks the name for that email address is Really Nice Guy because he is in my contacts on gmail.

All of this extends to the other virtual filesystems that libferris supports. You can "cp" from your scanner or webcam or a tuple of a database directly to google drive if that floats your boat.

I've already had a bit of a sniff at the dropbox API and others, so you might be able to bounce data between clouds in a future release.

Saturday, July 27, 2013

The new google://drive/ URL!

The very short story: libferris can now mount Google Drive as a filesystem. I've placed that in google://drive and will likely make an alias from gdrive:// to that same location so either will work.

The new OAuth 2.0 standard is so much easier to use than the old 1.0 version. In short, after being identified and given the nod once by the user, in 2.0 you have to supply a single secret, in 1.x you have to use per message nonce, create hashes, send the key and token, etc. The main drawback of 2.0 is that you have to use TLS/SSL for each request to protect that single auth token. A small price to pay, as you might well want to protect the entire conversation if you are doing things that require authentication anyway.

A few caveats of the current implementation: mime types on uploaded files are based on file name sniffing. That is because for the upload you might be using cp foo.jpg google://drive and the filesystem just copies the bytes over. But GDrive needs to know the mimetype for that new File at creation time. The GDrive PATCH method doesn't seem to let you change the mimetype of a file after it has been sent. A better solution will involve the cp code prenotifying the target location so that some metadata (mimetype) can be prefetched from the source file if desired. That would allow full byte sniffing to be used.
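As an illustration of what file name sniffing amounts to, here is a small Python sketch using the standard mimetypes module. It is not the libferris implementation, and the fallback type for unknown extensions is my assumption.

```python
import mimetypes


def sniff_mimetype_from_name(filename, default="application/octet-stream"):
    """Guess a mimetype from the file name alone, without reading any bytes."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Names with no known extension fall back to a generic binary type.
    return guessed or default
```

The obvious weakness is that a JPEG saved as "holiday" or with the wrong extension gets the wrong type, which is why byte sniffing on the source side would be better.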

Speaking of PATCH, if you change metadata using it, you always get back a 200 response. No matter what. Luckily you also get back a JSON string with all the metadata for the file you have (tried to) update. So I've made my PATCH caller code ignore the HTTP response code and compare the returned file JSON to see if the changes actually stuck or not. If a value isn't set how it is expected my PATCH throws an exception. This is in contrast to the docs for the PATCH method which claim that the file JSON is only returned "if successful".
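The compare step looks roughly like the following Python sketch. The real code lives in libferris and is C++, so the names here (verify_patch, PatchNotApplied) are invented for the example, and it only checks top-level keys.

```python
import json


class PatchNotApplied(Exception):
    """Raised when the server's returned metadata does not match the request."""


def verify_patch(requested, response_body):
    """Ignore the HTTP status and check the returned file JSON instead.

    requested is the dict of metadata that was sent in the PATCH and
    response_body is the JSON text the server sent back.
    """
    returned = json.loads(response_body)
    mismatched = {
        key: (wanted, returned.get(key))
        for key, wanted in requested.items()
        if returned.get(key) != wanted
    }
    if mismatched:
        raise PatchNotApplied("changes did not stick: %r" % mismatched)
    return returned
```

So a PATCH of {"title": "new"} that comes back with the old title still in the JSON raises, even though the HTTP status said everything was fine.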

Oh yeah, one other tiny thing about PATCH. If you patch the description it didn't show up in Firefox for me until I refreshed the page. Changing the title does update the Firefox UI automatically. I guess the sidepanel for description hasn't got the funky web notification love yet.

There are two ways I found to read a directory, using files/list and children/list. Unfortunately the latter, while returning only the direct children of a folder, also only returns a few pieces of information for those children, the most interesting being the child's id. On the other hand files/list gives you almost all the metadata for each returned File. So on a slower link, one doesn't need thinking music to work out if one round trip or two are the desired number. The files/list call also returns metadata for files that have been deleted, and files which others have shared with you. It is easy to set a query "hidden = false and trashed = false" for files/list to not return those dead files. Filtering on the server exclusively for files that you own is harder. There is a query alias sharedWithMe but no OwnedByMe to return the counter set. I guess perhaps "not sharedWithMe" would == OwnedByMe.
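For anyone poking at this by hand, the query goes into the q parameter of the files/list request. A quick Python sketch of building that URL follows. The endpoint constant is my assumption of the v2 files/list address, and the "not sharedWithMe" clause is only the guess from above, not something I have verified.

```python
from urllib.parse import urlencode

# Assumed address of the Drive v2 files/list call.
FILES_LIST_URL = "https://www.googleapis.com/drive/v2/files"


def files_list_url(hide_dead=True, only_not_shared=False):
    """Build a files/list URL that filters results on the server side."""
    clauses = []
    if hide_dead:
        # Skip hidden files and files sitting in the trash.
        clauses += ["hidden = false", "trashed = false"]
    if only_not_shared:
        # Unverified guess at an "owned by me" filter.
        clauses.append("not sharedWithMe")
    if not clauses:
        return FILES_LIST_URL
    return FILES_LIST_URL + "?" + urlencode({"q": " and ".join(clauses)})
```

Calling files_list_url() gives the "hidden = false and trashed = false" query with the spaces and equals signs escaped for use in a URL.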

Currently I sort of ignore the directory hierarchy that files/list returns. So all your drive files are just in google://drive/ instead of subdirs as appropriate. I might leave that restriction in the first release. It's not hard to remove, but I've been focusing on upload, download, and metadata change.

Creating files, updating metadata, and downloading files from GDrive all work and will be available in the next libferris release. I have one other issue to clean up (rate limiting directory reads) before I do the first libferris release with gdrive mounting.

Oh and big trap #2 for the young players. To actually *use* libferris on gdrive after you have done the OAuth 2.0 "yep, libferris can have access" you have to go to code.google.com/apis/console and enable the drive API for your account, otherwise you get access denied errors for everything. And once you go to the console and do that, you'll have to OAuth again to get a valid token.

A huge thank you to those who contributed to the ferris fund raising after my last post proposing mounting Google Drive!

Monday, July 22, 2013

Mounting Google Drive?

So on the heels of resurrecting and expanding the support for mounting vimeo as a filesystem using libferris I started digging into mounting Google Drive. As is normally the case for these things, the plan is to start out with listing files, then uploading files, then downloading files, then updating the metadata for files, then rename, then delete, and finally funky stuff like "tail -f" and append instead of truncate on upload.

One plus of all this is that the index & search in libferris will then extend its claws to GDrive as well as desktop files, as I&S is built on top of the virtual filesystem and uses the virtual filesystem to return search results.

For those digging around maybe looking to do the same thing, see the oauth page for desktop apps, and the meat seems to be in the Files API section. Reading over some of the API, the docs are not too bad. The files.watch call is going to take some testing to work out what is actually going on there. I would like to use the watch call for implementing "tail -f" semantics on the client, which is in turn most useful with open(append) support. The latter I'm still tracking down in the API docs, if it is even possible. PUT seems to update the whole file, and PATCH seems very oriented towards doing partial metadata updates.

The trick that libferris uses of exposing the file content through the metadata interface seems to be less used by other tools. With libferris, using fcat and the -a option to select an extended attribute, you can see the value of that extended attribute. The content extended attribute is just the file's content :)

$ date > df.txt
$ fcat -a name df.txt
df.txt
$ fcat -a mtime-display df.txt
13 Jul 23 16:33
$ fcat -a content df.txt
Tue Jul 23 16:33:51 EST 2013

Of course you can leave out the "-a content" part to get the same effect, but anything that is wanting to work on an extended attribute will also implicitly be able to work on the file's byte content as well with this mechanism.

If anyone is interested in hacking on this stuff (: good ;) patches accepted. Conversely if you would like to be able to use a 'cp' like tool to put and get files to gdrive you might consider contributing to the ferris fund raising. It's amazing how much time these Web APIs mop up in order to be used. It can be a fun game trying to second guess what the server wants to see, but it can also be frustrating at times. One gets very used to being able to see the source code on the other side of the API call, and that is taken away with these Web thingies.

Libferris is available for Debian Hard Float and Debian armel soft floating point. I've just recently used the armhf to install ferris on an OMAP5 board. I also have a build for the Nokia N9 and will update my Open Build Service Project to roll fresh rpms for Fedora at some stage. The public OBS desktop targets have fallen a bit behind the ARM builds because I tend to develop on and thus build from source on desktop.

Saturday, July 20, 2013

Like a Bird on a Wire(shark)...

Over recent years, libferris has been using Qt to mount some Web stuff as a filesystem. I have a subclass of QIODevice which acts as an intermediary to allow one to write to a std::ostream and stream that data to the Web, over a POST for example. For those interested, that code is in Ferris/FerrisQt.cpp of the tarball. It's a bit of a shame that Qt heavy web code isn't in KIO or that the two virtual filesystems are not closer linked, but I digress.

I noticed a little while ago that cp to vimeo://upload didn't work anymore. I had earmarked that for fixing and recently got around to making that happen. It's always fun interacting with these Web APIs. Over time I've found that Flickr sets the bar for well documented APIs that you can start to use if you have any clue about making GET and POST etc. At one stage google had documented their API in a way that you could never use it. I guess they have fixed that by now, but it did sort out the pretenders from those who could at least sniff HTTP and were determined to win. The vimeo documentation IIRC wasn't too bad when I added support to upload, but the docs have taken a turn for the worse it seems. Oh, one fun tip for the young players: when one API call says "great, thanks, well done, I've accepted your call" and then a subsequent one says "oh, a strange error has happened", you might like to assume that the previous call might not have been so great after all.

So I started tinkering around, adding oauth to the vimeo signup, and getting the getTicket call to work. Having getTicket working meant that my oauth signed call was accepted too. I was then faced with the upload of the core data (which is normally done with a rather complex streaming POST), and the final "I'm done, make it available" call. On vimeo that last call seems to be two calls now, first a VerifyChunks call and then a Complete call.

So, first things first. To upload you call getTicket which gives you an endpoint that is an HTTP URL to send the actual video data to, as well as an upload ticket to identify the session. If you try to post to that endpoint URL and the POST converts the CGI parameters using multipart/form-data with boundaries into individual Content-Disposition: form-data elements, you lose. You have to have the ticket_id in the URL after the POST text in order to upload. One little trap.

So then I found that verifyChunks was returning Error 709 Access to the chunk list failed. And that was after the upload had been replied to with "OK. Thanks for the upload.". Oddly, I also noticed that the upload of video data would hang from time to time. So I let the shark out of the pen again, and found that vimeo would return its "yep we're done, all is well" response to the HTTP POST call at about 38-42kb into the data. Not so great.

Mangling the vimeo.php test they supply to upload with my oauth and libferris credentials I found that the POST had a header Expect: 100-continue. Right after the headers were sent vimeo gave the nod to continue, and then the POST body was sent. I assume that just ploughing through and giving the headers followed by the body confused the server end and thus it just said "yep, ok, thanks for the upload" and dropped the line. Then of course forgot the ticket_id because there was no data for it, so the verifyChunks got no chunk list and returned the strange error it did. mmm, hindsight!

So I ended up converting from the POST to the newly available PUT method for upload. They call that their "streaming API" even though you can of course stream to a POST endpoint. You just need to frame the parameters and add the MIME trailer to the POST if you want to stream a large file that way. Using PUT I was then able to verify my chunks (or the one single chunk in fact) and the upload complete method worked again.
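For the curious, framing a streamed multipart/form-data POST comes down to something like this Python sketch. It is not the FerrisQt.cpp code, and the helper names and the file field name are made up for the example. The idea is that everything before the file bytes is one precomputed blob, everything after is the trailer, and the file itself is passed through in chunks.

```python
import uuid


def multipart_frame(fields, file_field, filename, content_type, boundary=None):
    """Return (boundary, preamble, trailer) to wrap around streamed file bytes."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        # Each plain parameter becomes its own form-data element.
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    # Headers for the file part; its body is streamed separately.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    )
    preamble = "".join(parts).encode()
    trailer = f"\r\n--{boundary}--\r\n".encode()
    return boundary, preamble, trailer


def stream_body(preamble, chunks, trailer):
    """Yield the POST body piece by piece without holding the file in memory."""
    yield preamble
    for chunk in chunks:
        yield chunk
    yield trailer
```

The request's Content-Type is then multipart/form-data with that boundary, and the Content-Length is len(preamble) plus the file size plus len(trailer), so it can be sent up front before any file bytes are read.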

In the end I've added oauth to my vimeo mounting, many thanks to the creators of the QOAuth library!