Monday, January 13, 2014

Getting stock data

Some sites, such as megabolsa, publish end-of-day stock quotes.

Each day's file can be downloaded with
wget http://www.megabolsa.com/cierres/140102.txt
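
To fetch several days at once, the same command can be wrapped in a small loop. This is only a sketch: the yymmdd file naming is inferred from the example URL above, and the function names and the list of dates are made up for illustration.

```shell
#!/bin/sh
# Base URL taken from the example above; the yymmdd naming is inferred from it.
BASE_URL="http://www.megabolsa.com/cierres"

# Print the URL of the end-of-day file for a date given as yymmdd.
quote_url() {
    printf '%s/%s.txt\n' "$BASE_URL" "$1"
}

# Download one file per date into the directory given as first argument.
# Dates for which the download fails are reported and skipped.
fetch_quotes() {
    dest=$1
    shift
    mkdir -p "$dest"
    for d in "$@"; do
        wget -q -P "$dest" "$(quote_url "$d")" || echo "no file for $d" >&2
    done
}

# Example call (hypothetical list of dates):
# fetch_quotes 2014_diarios 140102 140103 140106
```

Keeping the URL construction in its own function makes it easy to check the pattern before downloading anything.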

Each line of the file has the following format:

SYMB,DATE(yyyymmdd),OPEN,MAX,MIN,CLOSE,VOL

Before a file of this kind can be processed, a newline character has to be appended to its end. This is done with
echo >> 140102.txt
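
Running echo twice on the same file would leave an empty line at the end. A possible way to avoid that is to append the newline only when it is missing. The function name below is my own, and the sketch assumes a POSIX shell with tail available.

```shell
# Append a newline to a file only when its last byte is not already one.
ensure_trailing_newline() {
    # $(...) strips a trailing newline, so the result is empty when the
    # file already ends with one (or when the file is empty).
    if [ -n "$(tail -c 1 "$1")" ]; then
        echo >> "$1"
    fi
}

# Example call on a whole directory of daily files:
# for f in 2014_diarios/*.txt; do ensure_trailing_newline "$f"; done
```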

To convert the comma-separated file to a "|"-separated file and change the date format from yyyymmdd to yyyy-mm-dd, use
awk -F',' '{x=$2;yy=substr(x,1,4);mm=substr(x,5,2); dd=substr(x,7,2); $2=yy"-"mm"-"dd}1' OFS='|' 140102.txt > tmpfile.txt
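Explanation:
  1. -F',' sets the input field separator to a comma
  2. the date is in the second field: x=$2
  3. extract the year, month and day substrings, e.g. mm=substr(x,5,2)
  4. compose the result: $2=yy"-"mm"-"dd
  5. the trailing 1 is an always-true pattern whose default action prints the (modified) line; OFS='|' makes the output field separator a pipe character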
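
To see the steps on a single line, here is the same awk program spread over several lines and fed one sample record. The sample values are made up, and I pass OFS with -v instead of after the program, which should be equivalent here.

```shell
# Hypothetical sample line in the SYMB,DATE,... layout described above.
line='ABC,20140102,1.00,1.10,0.95,1.05,1000'

converted=$(printf '%s\n' "$line" |
    awk -F',' -v OFS='|' '{
        x  = $2                 # date as yyyymmdd
        yy = substr(x, 1, 4)
        mm = substr(x, 5, 2)
        dd = substr(x, 7, 2)
        $2 = yy "-" mm "-" dd   # assigning a field rebuilds the line with OFS
    } 1')                       # pattern 1 is always true: print the line

echo "$converted"
# prints: ABC|2014-01-02|1.00|1.10|0.95|1.05|1000
```
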
We can do this for all our files at once, sort the result, and send it to a single output file. Because >> appends, make sure the output file does not exist beforehand:
awk -F',' '{x=$2;yy=substr(x,1,4);mm=substr(x,5,2);dd=substr(x,7,2); $2=yy"-"mm"-"dd}1' OFS='|' 2014_diarios/*.txt | sort >>tmp.txt

Finally, do the same processing as in this post. First, insert zero-filled placeholder columns to match the beancounter table structure:
awk -F'|' '{$(NF)="0|0|0" FS $(NF);}1' OFS='|' tmp.txt | awk -F'|' '{$(3)="0" FS $(3);}1' OFS='|' > tmp.txt.cut
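
As a quick check of what the two awk stages do, they can be run on one sample line. The values are made up, and OFS is passed with -v here, which should behave the same as the form above. The first stage puts three zero columns before the last field, the second puts one zero column before the third field, giving eleven fields in total.

```shell
# Hypothetical sample line, already pipe-separated with a yyyy-mm-dd date.
line='ABC|2014-01-02|1.00|1.10|0.95|1.05|1000'

padded=$(printf '%s\n' "$line" |
    awk -F'|' -v OFS='|' '{$(NF)="0|0|0" FS $(NF);}1' |
    awk -F'|' -v OFS='|' '{$(3)="0" FS $(3);}1')

echo "$padded"
# prints: ABC|2014-01-02|0|1.00|1.10|0.95|1.05|0|0|0|1000
```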
Then import the result into the database:
sqlite3 val.db
sqlite> .import tmp.txt.cut stockprices
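
The import can also be scripted instead of typed at the prompt. The sketch below assumes the sqlite3 command-line shell is installed and that the target table already exists with a matching number of columns. The function name is hypothetical, and the file names are passed unquoted to .import, so they should not contain spaces.

```shell
# Import a pipe-separated file into an existing table without opening the
# interactive prompt. Arguments: database file, data file, table name.
# Prints the number of rows in the table after the import.
import_prices() {
    sqlite3 "$1" <<EOF
.separator "|"
.import $2 $3
SELECT COUNT(*) FROM $3;
EOF
}

# Example call with the file names used above:
# import_prices val.db tmp.txt.cut stockprices
```

Setting the separator explicitly keeps the import from depending on the shell's default output mode.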



PS: You can join all of your files with cat
cat *.txt |sort > out.txt
