Uploading numerous files with Paperclip

May 29th, 2011

The Conundrum:

On my current Rails application, I had to solve a specific problem: how do I use Paperclip to store many files generated by a single processor when I can't predict how many files there will be or what they will be named?

More Background:

With Paperclip, my project uploads a data file and runs a proprietary processor on it. When that processor is done, it will have created between 5 and ~10 files. One of those files is a metadata file which contains the names of the other files designed to be read by a Flash/JavaScript consumer.

These filenames define the relationship of the output file to the original, and need to be kept. Unfortunately, they define their relationship via a floating point number in the filename, so I can't predict the names ahead of time.

The Implementation:

First, crack open the Paperclip::Attachment class and expose some instance variables for meddling:
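A minimal sketch of what that reopening could look like. The name queued_for_write comes from later in this post; queued_for_delete is my assumption about the name of Paperclip's internal delete queue, so check it against the version of the gem you have installed.

```ruby
# Reopen Paperclip::Attachment. With the gem loaded this adds to the real
# class; without it, Ruby simply defines a bare stand-in, which is enough
# to try the sketch on its own.
module Paperclip
  class Attachment
    # queued_for_write:  hash of style => file waiting to be stored on save
    # queued_for_delete: list of paths waiting to be removed (assumed name)
    attr_accessor :queued_for_write, :queued_for_delete
  end
end
```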

This is where alarm bells should be going off in your head. Yes, I am taking a hacksaw and a carving knife to Paperclip, but we'll discuss the negatives of my approach later.

Now that we can manipulate the queues for storing and deleting files, how do we add files to the write queue at the appropriate time? That's the easy part:
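Roughly, the processor walks its output directory and drops each generated file into the write queue, keyed by its own filename so that the filename plays the role of a "style". The helper below is a sketch under that assumption: queue_extra_files, output_dir and main_file are hypothetical names, and in a real processor it would be called from make with @attachment.

```ruby
# Hypothetical helper: queue every file the processor produced, except the
# main output, under its own filename.
def queue_extra_files(attachment, output_dir, main_file)
  Dir.glob(File.join(output_dir, "*")).sort.each do |path|
    next if path == main_file            # the main file goes through the normal flow
    next unless File.file?(path)         # skip subdirectories
    attachment.queued_for_write[File.basename(path)] = File.open(path, "rb")
  end
  attachment.queued_for_write
end
```

Anything that responds to queued_for_write with a hash will do for trying this out, which keeps it testable without the gem.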

That is enough to get the files stored together. The problem now is that their filenames are mangled. If you are lucky enough that your output files have the same extension as the main file, all you have to do is strip the extension from the key in @attachment.queued_for_write.

If you are unlucky like I was, it's a little more... difficult.

The Interpolations module is how Paperclip decides on filenames. For instance, if you use a path ending in :style.:extension (which is what we used, and I believe the default), the style name already contains the period and the extension, because of how we added the "style" to the queued_for_write hash.

The if statements in extension and style are specific to your case. Mine checked whether the style was a String (since all my other style definitions are symbols) and whether it looked like a filename with certain extensions (so: /\.(mov|mpe?g|avi)/ and so on).

Now files should be getting stored with their original names and extensions. We're done, right?

Not so much. The files are stored, but Paperclip's delete calls don't know they exist, so they just sit there and never get deleted.
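The decision those if statements make can be sketched as plain helpers. The names here (extra_file_style?, style_part, extension_part) are hypothetical, and the extension list is only the example from above. The overridden style and extension interpolations would consult something like this and fall back to Paperclip's usual behaviour for ordinary symbol styles.

```ruby
# Extensions that mark a style name as one of the extra files (example list).
def extra_file_pattern
  /\.(mov|mpe?g|avi)\z/i
end

# True only for String style names that look like one of the extra files.
def extra_file_style?(style_name)
  style_name.is_a?(String) && !(style_name =~ extra_file_pattern).nil?
end

# The :style part: the filename without its final extension.
def style_part(style_name)
  extra_file_style?(style_name) ? File.basename(style_name, ".*") : style_name.to_s
end

# The :extension part: the file's own extension, else the usual one.
def extension_part(style_name, default_extension)
  extra_file_style?(style_name) ? File.extname(style_name).delete(".") : default_extension
end
```

With a path ending in :style.:extension, the two parts join back into the original filename, floating point number and all.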

This is why we added the second attr_accessor to Paperclip::Attachment. The first thing we need is a persistent way to keep track of the files. To the model, Batman! Create a migration that adds a text field to the model, called something like extra_files. In the model, add:
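The usual way to keep an array in a text column is a serialize :extra_files declaration in the model, which stores the value as YAML; treat that as my assumption about the shape of the model code rather than a quote. The plain-Ruby sketch below only shows the round trip such a declaration performs, with hypothetical helper names and made-up filenames.

```ruby
require "yaml"

# What would land in the extra_files text column.
def dump_extra_files(names)
  YAML.dump(Array(names))
end

# What the model would hand back when the record is loaded.
def load_extra_files(text)
  text.to_s.empty? ? [] : YAML.load(text)
end

stored = dump_extra_files(["clip_0.25.mov", "clip_0.5.mov", "meta.xml"])
restored = load_extra_files(stored)
```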

Then, in the processor add the line:

And open back up the Paperclip::Attachment class and add:
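The idea is that whenever Paperclip queues its own styles for deletion, the recorded extra files get queued alongside them. Exactly which Paperclip method to hook depends on the gem version, so the sketch below only covers the queueing step; queue_extra_files_for_delete and the directory argument are hypothetical, and extra_files is the list the processor stored on the model.

```ruby
# Hypothetical helper: add each recorded extra file to the delete queue,
# without queueing the same path twice.
def queue_extra_files_for_delete(attachment, extra_files, directory)
  Array(extra_files).each do |name|
    path = File.join(directory, name)
    unless attachment.queued_for_delete.include?(path)
      attachment.queued_for_delete << path
    end
  end
  attachment.queued_for_delete
end
```

Once the paths are in the queue, Paperclip's normal delete pass should treat them like any other stored style.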

And there you have it.

Most of these changes live in one file in my application, overriding only the functions needed.

What is wrong with it?

It's butchery. We have exposed several of Paperclip's internals and overridden them. This works for now, but has the potential to break in spectacular and violent ways with a gem update.

What could we do instead?

I have thought of several ways this could be done instead of what I did.

  1. Subclass Paperclip::Attachment and insert the logic instead of just aliasing out the functions.

  2. Create a new model which has an attachment for each "extra" file.

Then why didn't you do it that way?

For #1, I only saw it as being worse than my approach in terms of maintainability. The more you overwrite in a class, the more it depends on the existing gem working as-is.

#2 was slightly more attractive, but my situation required all the files to be located in the same directory, and it would have required the processor to create 5-10 records for every one record in the parent table, so it didn't appeal.

So what are you going to do about it?

I'm looking for a better way that I missed. Suggestions are appreciated.

