Why I built downldr
The problem I wanted to solve
I wanted a file downloader that checks that the file being downloaded is actually of the type I want, to avoid saving invalid files. The actual file type, not what the Content-Type header says!
In any case, I couldn't find a package that allowed filtering, not even by the Content-Type header, so I decided to build one!
If you know of one, please leave a comment with the name of the package.
What is downldr?
It is a file downloader that supports file type filtering and conditional piping, so if the downloaded file is incorrect or the request fails, you don't end up with an empty file.
Where can I get it?
You can get it from npm:
npm install downldr
How does the filtering work?
Using the extension or the Content-Type header to detect the file type is not bulletproof. The header can be set to any value, it usually defaults to the type inferred from the extension, and if the extension is missing it may come as application/octet-stream.
We all know that we can do the following:
mv video.mp4 image.jpg
Now we have a video with a jpg extension, but it's still a video in the end.
So if we were to download that file to use it later in our application, we would get unexpected results.
downldr('https://example.com/image.jpg')
  .pipe(fs.createWriteStream(path.join(__dirname, 'images', 'image.jpg')));
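To make the mismatch concrete, here is a minimal sketch that compares what the extension suggests with what the bytes say. The names guessFromExtension and looksLikeMp4 are hypothetical helpers written for this illustration, the extension table is a simplified stand-in for what a static server might do, and the MP4 check (the ASCII tag "ftyp" at offset 4) is a simplification that covers typical files only.

```javascript
const path = require('path');

// Simplified stand-in for how a static server might pick a Content-Type.
const MIME_BY_EXTENSION = {
  '.jpg': 'image/jpeg',
  '.png': 'image/png',
  '.mp4': 'video/mp4',
};

// Hypothetical helper: trusts the file name only.
function guessFromExtension(fileName) {
  const ext = path.extname(fileName).toLowerCase();
  return MIME_BY_EXTENSION[ext] || 'application/octet-stream';
}

// Hypothetical helper: typical MP4 files carry the ASCII tag "ftyp" at offset 4.
function looksLikeMp4(buffer) {
  return buffer.length >= 8 && buffer.toString('latin1', 4, 8) === 'ftyp';
}

// In-memory stand-in for the first bytes of video.mp4 after renaming it to image.jpg:
// 4 size bytes, then "ftyp", then a brand.
const renamedVideo = Buffer.concat([
  Buffer.from([0x00, 0x00, 0x00, 0x18]),
  Buffer.from('ftypisom', 'latin1'),
]);

console.log(guessFromExtension('image.jpg')); // image/jpeg
console.log(looksLikeMp4(renamedVideo)); // true
```

The extension says image/jpeg while the content says video, which is exactly the situation a content-based filter is meant to catch.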
Luckily for us, downldr comes with a filter option that allows us to choose which files we want to download, and it uses neither the Content-Type header nor the extension to detect the type.
Most file formats have a signature that allows us to identify or verify the content. That signature is often called magic numbers or magic bytes. downldr uses the file-type package under the hood to provide this functionality.
But if you wanted to implement it yourself for a specific file type, it is not that hard.
For example, the hex signature for a png file is: 89 50 4E 47 0D 0A 1A 0A
In Node.js, that translates to: Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);
So the code for implementing a png detector is the following:
const fs = require('fs');

function isPNG(buffer) {
  const magicBytes = Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);
  // The signature must sit at the very start of the buffer
  return buffer.indexOf(magicBytes) === 0;
}

fs.createReadStream('image.png') // an actual png :)
  .once('data', chunk => {
    console.log(`png: ${isPNG(chunk)}`); // png: true
  });

fs.readFile('image.jpg', (err, content) => {
  if (err) throw err;
  console.log(`png: ${isPNG(content)}`); // png: false
});
Note that we only need a few bytes, so sending the first chunk of a stream is enough!
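The same idea extends to more than one type. Below is a minimal sketch, assuming a small hand-written signature table and two hypothetical helpers, detectMime and readHead. It is not how file-type is implemented, and a real detector knows far more formats. readHead reads only the first few bytes of a file instead of loading all of it.

```javascript
const fs = require('fs');

// A few well-known signatures, listed here for illustration only.
const SIGNATURES = [
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { mime: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
];

// Hypothetical helper: returns the MIME type whose signature starts the buffer,
// or undefined when nothing in the table matches.
function detectMime(buffer) {
  const match = SIGNATURES.find(({ bytes }) =>
    buffer.length >= bytes.length &&
    bytes.every((byte, i) => buffer[i] === byte));
  return match ? match.mime : undefined;
}

// Hypothetical helper: reads at most `length` bytes from the start of a file.
function readHead(filePath, length = 16) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const head = Buffer.alloc(length);
    const bytesRead = fs.readSync(fd, head, 0, length, 0);
    return head.subarray(0, bytesRead);
  } finally {
    fs.closeSync(fd);
  }
}

console.log(detectMime(Buffer.from([0xff, 0xd8, 0xff, 0xe0]))); // image/jpeg
console.log(detectMime(Buffer.from('plain text'))); // undefined
```

For a file on disk, detectMime(readHead('image.jpg')) would check the content while reading only 16 bytes.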
Now let's see how file filtering is done in downldr.
downldr('https://example.com/image.jpg', {
  filter: (type, chunk, statusCode) => {
    // type.mime may be undefined for some files
    console.log(type.contentType); // image/jpg
    console.log(type.mime); // video/mp4
    return type.mime === 'image/jpeg';
  },
  // For the above filtering, it will be out.jpg
  target: (ext) => fs.createWriteStream(`out.${ext}`)
})
  // Error: Invalid type: video/mp4 - Status Code: 200
  .on('error', console.error)
  .on('complete', () => console.log('done!'));
When you use the target option instead of .pipe, it works like a conditional pipe. The file stream is only created once the filter function returns true, so we avoid creating an empty file!
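The conditional pipe idea can be sketched with plain Node.js streams. This is not downldr's implementation. conditionalPipe is a hypothetical helper written for this illustration: it inspects the first chunk and only calls createTarget once the check passes. It uses in-memory streams so it runs without network or disk access, and error handling on the target is kept minimal.

```javascript
const { once } = require('events');
const { Readable, Writable } = require('stream');

// Hypothetical helper (not part of downldr): pipes `source` into a target
// that is created lazily, only after `accept(firstChunk)` returns true.
async function conditionalPipe(source, accept, createTarget) {
  let target = null;
  for await (const chunk of source) {
    if (target === null) {
      // Throwing inside for-await also destroys the source stream.
      if (!accept(chunk)) throw new Error('Rejected by filter');
      target = createTarget();
    }
    // Respect backpressure: wait for 'drain' when the target's buffer is full.
    if (!target.write(chunk)) await once(target, 'drain');
  }
  if (target === null) throw new Error('Source was empty');
  const finished = once(target, 'finish');
  target.end();
  await finished;
}

const PNG_MAGIC = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const isPNG = (chunk) => chunk.subarray(0, PNG_MAGIC.length).equals(PNG_MAGIC);

// Counts how many targets get created; stands in for fs.createWriteStream.
let targetsCreated = 0;
const createTarget = () => {
  targetsCreated += 1;
  return new Writable({ write(chunk, encoding, callback) { callback(); } });
};

conditionalPipe(Readable.from([Buffer.from('not a png')]), isPNG, createTarget)
  .catch((err) => console.log(err.message, '| targets created:', targetsCreated));
// Rejected by filter | targets created: 0
```

Because the target factory is never called for a rejected source, swapping it for a function that returns fs.createWriteStream(...) would leave no empty file behind.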
You can check the documentation for more options and advanced usage!