Geeky post again - no math this time, but computer code.
I'm sure people have done this before, but I thought the following problem would be a nice opportunity to practice my Python skills with a small script. When I read a scientific article, I usually look out for the following elements:
- Innovation: what does the study do that others haven't done before?
- Method: what method did they use?
- Data: where did they get their data from?
- Results: what are the main results?
- Relevance: who benefits from this research, and how?
I also like to place the research in one of the four quadrants in this post. I find it helpful to make an overview of these questions in a table:
1st author | Year | Journal | Quadrant | Innovation | Method | Data | Results | Relevance |
Kompas | 2005 | Journal of Productivity Analysis | 4 | Estimates efficiency gains quota trade for Southeast Trawl Fishery, AU | Stochastic frontier analysis | AFMA and ABARE survey data | ITQs gave efficiency gains | Policy debate on ITQs |
Kompas | 2006 | Pacific Economic Bulletin | 3 | Estimates optimal effort levels and allocation across species | Multifleet, multispecies, multiregion bioeconomic model | SPC data | Effort reduction needed; optimal stocks larger than BMSY | Policy debate on MEY |
But here's the problem: I usually make my notes in a BibTeX file (as a good geek should), which looks like this:
@ARTICLE{Kompas2006PacEconBull,
author = {Kompas, T. and Che, T.N.},
title = {Economic profit and optimal effort in the Western and Central Pacific tuna fisheries},
journal = {Pacific Economic Bulletin},
year = {2006},
volume = {21},
pages = {46-62},
number = {3},
data = {SPC data},
innovation = {Estimates optimal effort levels and allocation across species},
quadrant = {3},
keywords = {tuna; bioeconomic model; optimisation; Pacific},
method = {Multifleet, multispecies, multiregion bioeconomic model},
results = {Effort reduction needed; optimal stocks larger than BMSY},
relevance = {Policy debate on MEY}
}
@ARTICLE{Kompas2005JProdAnalysis,
author = {Kompas, Tom and Che, Tuong Nhu},
title = {Efficiency gains and cost reductions from individual transferable quotas: A stochastic cost frontier for the Australian South East fishery},
journal = {Journal of Productivity Analysis},
year = {2005},
volume = {23},
pages = {285-307},
number = {3},
quadrant = {3},
data = {AFMA and ABARE survey data},
innovation = {Estimates efficiency gains quota trade for Southeast Trawl Fishery, AU},
keywords = {individual transferable quotas; stochastic cost frontier; fishery efficiency; ITQs},
method = {Stochastic frontier analysis},
relevance = {Policy debate on ITQs.},
results = {ITQs gave efficiency gains}
}
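To make the structure concrete: once parsed, each entry is essentially a dictionary of field names and values. The sketch below only illustrates that idea. `parse_entry` is a hypothetical helper, not part of any library, and it only handles flat `key = {value}` fields without nested braces, which happens to be enough for the two entries above.

```python
import re

# Hypothetical helper, for illustration only: it handles flat
# `key = {value}` fields without nested braces. A real BibTeX parser
# has to cope with far more than this.
FIELD_RE = re.compile(r'(\w+)\s*=\s*\{([^{}]*)\}')
HEADER_RE = re.compile(r'@(\w+)\s*\{\s*([^,\s]+)\s*,')


def parse_entry(text):
    """Turn one BibTeX entry into a plain dict with lower-cased keys."""
    header = HEADER_RE.search(text)
    entry = {'type': header.group(1).lower(), 'id': header.group(2)}
    for key, value in FIELD_RE.findall(text):
        entry[key.lower()] = value.strip()
    return entry


# Shortened copy of the first entry above.
sample = """@ARTICLE{Kompas2006PacEconBull,
author = {Kompas, T. and Che, T.N.},
journal = {Pacific Economic Bulletin},
year = {2006},
quadrant = {3},
data = {SPC data}
}"""

entry = parse_entry(sample)
print(entry['author'], entry['year'], entry['quadrant'])
```

Custom note fields such as `quadrant` or `data` come out as ordinary keys next to `author` and `year`, which is what makes the conversion to a table straightforward.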
I don't want to copy it all by hand, so I wrote this little script in Python to convert entries in the BibTeX file (here, the articles by Kompas) to a CSV file:
import csv
from operator import itemgetter

from bibtexparser.bparser import BibTexParser


def readFirstAuthor(inpList, num):
    # Return the first author's surname: everything before the first comma
    # in the author field (assumes "Surname, First name and ..." format).
    author1 = ""
    x = inpList[num]['author']
    for j in x:
        if j != ',':
            author1 += j
        else:
            break
    return author1


def selectDict(inpList, name):
    # Keep only the articles whose author field contains `name`.
    # .get() avoids a KeyError for entries without an author or type.
    outObj = []
    for i in range(len(inpList)):
        if name in inpList[i].get('author', '') and \
                inpList[i].get('type') == 'article':
            outObj.append(inpList[i])
    return outObj


def selectFieldsDict(inpList, fieldNames):
    # Reduce every entry to the requested fields; missing ones become 'blank'.
    outObj = []
    for i in range(len(inpList)):
        temp = {}
        for n in fieldNames:
            if n == 'author':
                temp['author'] = readFirstAuthor(inpList, i)
            elif n in inpList[i]:
                temp[n] = inpList[i][n]
            else:
                temp[n] = 'blank'
        outObj.append(temp)
    return outObj


fieldnames = ['author', 'year', 'journal', 'quadrant',
              'innovation', 'method', 'data', 'results', 'relevance']

with open('BibTexFile.bib', 'r') as bibfile:
    bp = BibTexParser(bibfile)
    record_list = bp.get_entry_list()

dictSelection = selectDict(record_list, 'Kompas')
fieldSelection = selectFieldsDict(dictSelection, fieldnames)
test = sorted(fieldSelection, key=itemgetter('year'))

# 'wb' is what the Python 2 csv module expects; on Python 3 use
# open('output.csv', 'w', newline='') instead.
with open('output.csv', 'wb') as test_file:
    csvwriter = csv.DictWriter(test_file, delimiter=',',
                               fieldnames=fieldnames)
    csvwriter.writeheader()
    for row in test:
        csvwriter.writerow(row)
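The field selection and CSV export can probably be written more compactly. The following is only a sketch under a few assumptions: it targets Python 3, it uses plain dictionaries (trimmed copies of the two entries above) instead of the parser output, and it writes to an in-memory buffer rather than to `output.csv`. The names `first_author` and `to_row` are made up for this example.

```python
import csv
import io

FIELDNAMES = ['author', 'year', 'journal', 'quadrant',
              'innovation', 'method', 'data', 'results', 'relevance']


def first_author(author_field):
    """Return the text before the first comma, i.e. the first surname
    when authors are written as 'Surname, First name and ...'."""
    return author_field.split(',')[0].strip()


def to_row(entry, fieldnames=FIELDNAMES):
    """Pick the requested fields, filling missing ones with 'blank'."""
    row = {name: entry.get(name, 'blank') for name in fieldnames}
    if 'author' in row:
        row['author'] = first_author(row['author'])
    return row


# Trimmed copies of the two entries above, standing in for parser output.
entries = [
    {'author': 'Kompas, T. and Che, T.N.', 'year': '2006',
     'journal': 'Pacific Economic Bulletin', 'quadrant': '3'},
    {'author': 'Kompas, Tom and Che, Tuong Nhu', 'year': '2005',
     'journal': 'Journal of Productivity Analysis', 'quadrant': '3'},
]

rows = sorted((to_row(e) for e in entries), key=lambda r: r['year'])

buffer = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Like the script, this relies on the author field being written as "Surname, First name"; an entry in "First name Surname" form would end up with the whole first author string (or more) in the author column.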
If you are a Python developer: any comments on this are welcome. I'm sure it's not perfect.