Scrapy: command to overwrite previous export file

Set-up

I export my data to a .csv file with the standard command in Terminal (macOS), e.g.

scrapy crawl spider -o spider_output.csv

Problem

When exporting again, Scrapy appends the new rows to the existing spider_output.csv instead of replacing it.

I can think of two solutions:

  1. Command Scrapy to overwrite instead of append
  2. Command Terminal to remove the existing spider_output.csv prior to crawling

I've read that (to my surprise) Scrapy currently isn't able to do 1. Some people have proposed workarounds, but I can't seem to get it to work.

I've found an answer to solution 2, but can't get it to work either.

Can somebody help me? Perhaps there is a third solution I haven't thought of?



Solution 1:[1]

There is an open issue with scrapy for this feature: https://github.com/scrapy/scrapy/issues/547

There are some solutions proposed in the issue thread:

scrapy runspider spider.py -t json --nolog -o - > out.json

Or just delete the output file before running the spider:

rm data.jl; scrapy crawl myspider -o data.jl
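A slightly more robust variant of the same idea (a sketch; `myspider` and `data.jl` are placeholder names from the example above): with `-f`, `rm` exits successfully even when no previous export exists yet, so the chained crawl command always runs.

```shell
# -f: do not fail when data.jl does not exist yet,
# so the && chain still reaches the crawl command.
rm -f data.jl && scrapy crawl myspider -o data.jl
```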

Solution 2:[2]

option -t defines the export format (json, csv, ...)

option -o FILE dumps scraped items into FILE (use - for stdout)

>filename redirects stdout to filename

Putting these together, to overwrite the previous export file (replace instead of append):

scrapy crawl spider -t csv -o - >spider.csv

or for json format:

scrapy crawl spider -t json -o - >spider.json
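A minimal demo of why this works, in plain shell with no Scrapy needed: `>` truncates the target file on every run, while `>>` appends. Scrapy's own `-o` behaves like `>>`, so redirecting stdout with `>` gives you a fresh file each crawl. (`demo_output.csv` is a throwaway file name for illustration.)

```shell
# '>' truncates the file before writing, so only the
# latest run's output survives.
printf 'first run\n'  > demo_output.csv
printf 'second run\n' > demo_output.csv   # truncates: only this line remains
cat demo_output.csv                       # shows just "second run"
rm demo_output.csv
```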

Solution 3:[3]

Use a capital -O (available since Scrapy 2.0), which overwrites the output file instead of appending:

scrapy crawl spider -O spider_output.csv

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Granitosaurus
Solution 2
Solution 3 Suraj Rao