Papers Page Code

From GersteinInfo

Revision as of 15:14, 16 September 2011 by Public (Talk | contribs)

SpreadSheet Structure

The "Papers Page" is generated from two Google spreadsheets, "Papers Master" and "Papers Subjects". "Papers Master" stores basic information about each paper. "Papers Subjects" stores information about each grant.

Here is a list of the tags and their meanings:

Papers Master:
<labid> - id by which to refer to the article
<PMID> - PubMed id
<title> - title of the article
<citation> - citation of the article (author, journal, year, etc)
<preprint> - URL of the preprint file
<subjects> - specifies the grant(s) funding the paper (e.g. "cegs,keck")
<website> - supplemental website
<Year> - year the article was published
<footnote> - additional information
<website2> - second supplemental website

The tags can conceptually be divided into two groups: ones such as PMID and title, which serve to identify the paper, and tags such as website and subject which supply supplemental information about the paper. There are two ways to identify a paper (in order of decreasing precedence):

I. PMID
II. title, citation

You should always include the PMID if a paper is known to be listed in PubMed. Option II should be used for papers that are in press.

The other group of tags supplies additional information about the paper specified by the first group of tags. All of these tags are optional; however, use of <subjects> and <preprint> is strongly encouraged.

labid	PMID	title	citation	preprint	subject	website	Year	footnote	website2
metamembrane	20430783			http://archive.gersteinlab.org/papers/e-print/metamembrane/preprint.pdf	interactions	http://metagenomics.gersteinlab.org/membrane	2010
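
The identification rule above (PMID first, otherwise title plus citation) can be sketched in Python. This is only an illustration, not the lab's actual code: the function names parse_master_row, identify and split_subjects are hypothetical, and the column order is copied from the header row shown above.

```python
# Column order copied from the example header row above.
MASTER_COLUMNS = ["labid", "PMID", "title", "citation", "preprint",
                  "subject", "website", "Year", "footnote", "website2"]


def parse_master_row(line):
    """Split one tab-delimited row into a dict; missing trailing cells become ''."""
    cells = line.rstrip("\n").split("\t")
    cells += [""] * (len(MASTER_COLUMNS) - len(cells))
    return dict(zip(MASTER_COLUMNS, cells))


def identify(row):
    """Return how a paper is identified: PMID takes precedence over title + citation."""
    pmid = row["PMID"].strip()
    if pmid:
        return ("pmid", pmid)
    title, citation = row["title"].strip(), row["citation"].strip()
    if title and citation:
        return ("title_citation", (title, citation))
    raise ValueError("row %r has neither a PMID nor a title and citation" % row["labid"])


def split_subjects(row):
    """Turn a comma-separated subject cell such as 'cegs,keck' into a list."""
    return [s.strip() for s in row["subject"].split(",") if s.strip()]


# The example row from the table above (title and citation cells are empty).
example = ("metamembrane\t20430783\t\t\t"
           "http://archive.gersteinlab.org/papers/e-print/metamembrane/preprint.pdf\t"
           "interactions\thttp://metagenomics.gersteinlab.org/membrane\t2010")
row = parse_master_row(example)
print(identify(row))
print(split_subjects(row))
```

Padding short rows with empty strings means the optional trailing columns (footnote, website2) can simply be left blank in the spreadsheet.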

Papers Subjects:
<category> - classification of grants
<labid> - id by which to refer to each grant
<title> - description of each grant
<website> - external website
<html> - additional information

We encourage you to sort by <category> after adding new grants, because of coding issues. The <website> value should also be reflected in the <html> section. For example, "don" has the <website> "http://www.donaghue.org" and also contains "URL: <A HREF=http://www.donaghue.org> http://www.donaghue.org</A>" in its <html> section.

category	labid	title	website	html
Research Grants	don	Donaghue Young Investigator	http://www.donaghue.org/	Young investigator award from Donaghue Foundation to M Gerstein (PI), "Comparative Genomics of Microbial Pathogens," (DF98-113, 1/1/99-12/31/03). URL: <A HREF=http://www.donaghue.org> http://www.donaghue.org</A> Articles funded by this grant:
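
The two conventions for "Papers Subjects" (keep rows sorted by category, and repeat the website inside the html cell) could be checked with a small script before rebuilding the page. This is a sketch under assumptions: the helper names website_in_html and sort_by_category are hypothetical, rows are assumed to be already loaded as dicts keyed by the column names above, and the second row is invented purely for illustration.

```python
def website_in_html(row):
    """True if the row has no website, or its website URL also appears in the html cell."""
    url = row["website"].strip().rstrip("/")
    return not url or url in row["html"]


def sort_by_category(rows):
    """Stable sort, so grants keep their relative order within each category."""
    return sorted(rows, key=lambda r: r["category"])


rows = [
    {"category": "Research Grants", "labid": "don",
     "title": "Donaghue Young Investigator",
     "website": "http://www.donaghue.org/",
     "html": "URL: <A HREF=http://www.donaghue.org> http://www.donaghue.org</A>"},
    # Hypothetical row, not taken from the real spreadsheet.
    {"category": "Hypothetical Category", "labid": "xyz",
     "title": "Placeholder grant",
     "website": "http://example.org",
     "html": "No link given here."},
]

for r in sort_by_category(rows):
    status = "ok" if website_in_html(r) else "website missing from html"
    print(r["category"], r["labid"], status)
```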

Generate publication documents from SpreadSheet

There are two steps (for a detailed flowchart, refer to Flowchart):

Download the XML file from NCBI using the PubMed IDs to generate pubmed_spreadsheet. Pubmed_spreadsheet stores the <title>, <citation>, etc. of the papers corresponding to each "PMID" in "Papers Master". This step is done automatically by scripts.

GoogleSpreadsheet.py: grabs a Google spreadsheet with Python, see Grab_GoogleSpreadsheet_with_a_Python
Other Code: see PubmedSpreadsheet_Generation_Code

Pipeline:

First, obtain pubmed_result.xml from the papers MEDLINE query:
   parse_pmids.py
   curl `cat ncbiquery.txt` > NCBIData.xml
Reformat NCBIData.xml into a tab-delimited file to upload to Google:
   python import.py
Replace the PubMed Import XML spreadsheet with export_out.tab
   (this can be added to the bottom of PubmedHandler.py instead of exporting the export.tab file using the NewGoogleSpreadsheet.py API)
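
The contents of ncbiquery.txt and the logic of import.py are not shown on this page, so the following is only a sketch of what the pipeline above might do. It assumes the query is an NCBI E-utilities efetch URL and that the tab-delimited export holds PMID, title, journal and year; the function names, the chosen columns and the sample XML values are all hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(pmids):
    """Build an efetch URL that returns PubMed XML for the given PMIDs."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"}
    return EFETCH + "?" + urlencode(params, safe=",")


def articles_to_rows(xml_text):
    """Convert PubMed XML into tab-delimited lines: PMID, title, journal, year."""
    root = ET.fromstring(xml_text)
    rows = []
    for art in root.iter("PubmedArticle"):
        pmid = art.findtext("MedlineCitation/PMID", default="")
        title_el = art.find("MedlineCitation/Article/ArticleTitle")
        # itertext() keeps text inside inline markup such as <i>...</i>.
        title = "".join(title_el.itertext()).strip() if title_el is not None else ""
        journal = art.findtext("MedlineCitation/Article/Journal/Title", default="")
        year = art.findtext(
            "MedlineCitation/Article/Journal/JournalIssue/PubDate/Year", default="")
        rows.append("\t".join([pmid, title, journal, year]))
    return rows


# Placeholder record, shaped like PubMed XML but with invented values.
SAMPLE = """<PubmedArticleSet><PubmedArticle><MedlineCitation>
<PMID>12345</PMID>
<Article>
<Journal><JournalIssue><PubDate><Year>2010</Year></PubDate></JournalIssue>
<Title>Placeholder Journal</Title></Journal>
<ArticleTitle>Placeholder title</ArticleTitle>
</Article></MedlineCitation></PubmedArticle></PubmedArticleSet>"""

print(build_efetch_url(["12345", "67890"]))
for line in articles_to_rows(SAMPLE):
    print(line)
```

The sample is parsed from a string so the sketch runs without network access; in the real pipeline the XML would come from the curl download.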

Build Papers Page

This step grabs all information from three spreadsheets, "Papers Master", "Papers Subjects" and "Pubmed_Spreadsheet", to build the whole website.

update.py: generates the whole website, see Build_Papers_Page_Code
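
How the three spreadsheets might be combined can be sketched as follows. This is not the code in update.py, which is documented separately. It assumes that each subject listed for a paper matches a <labid> in "Papers Subjects" and that PubMed records are keyed by PMID; all function names, the output HTML layout and the sample values are hypothetical.

```python
from collections import defaultdict
from html import escape


def fill_from_pubmed(paper, pubmed_by_pmid):
    """Copy title and citation from the PubMed table when the paper has a PMID."""
    merged = dict(paper)
    record = pubmed_by_pmid.get(paper.get("PMID", ""))
    if record:
        merged["title"] = record["title"]
        merged["citation"] = record["citation"]
    return merged


def group_by_subject(papers):
    """Map each subject id to the papers that list it (a paper may list several)."""
    groups = defaultdict(list)
    for paper in papers:
        for subject in paper.get("subject", "").split(","):
            if subject.strip():
                groups[subject.strip()].append(paper)
    return groups


def render_grant_section(grant, papers):
    """Render one grant heading followed by a list of its papers."""
    items = "\n".join(
        "<li>%s. %s</li>" % (escape(p["title"]), escape(p["citation"]))
        for p in papers)
    return "<h2>%s</h2>\n<ul>\n%s\n</ul>" % (escape(grant["title"]), items)


# Placeholder data standing in for the three spreadsheets.
master = [{"labid": "paperA", "PMID": "12345", "title": "", "citation": "",
           "subject": "cegs,keck"}]
pubmed = {"12345": {"title": "Placeholder title", "citation": "Placeholder J. 2010"}}
grants = [{"labid": "cegs", "title": "Placeholder grant"}]

papers = [fill_from_pubmed(p, pubmed) for p in master]
by_subject = group_by_subject(papers)
for grant in grants:
    print(render_grant_section(grant, by_subject.get(grant["labid"], [])))
```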

Lab Member: Papers Page Rebuild and Further Info (Private)

The Private Wiki includes instructions on how to update the papers page, along with further information: Private Wiki

Papers Page Code (Old)

Papers_Page_Code_Old

Old Papers Server

Old version of the papers website: Old Papers

Other information about papers: Paper_search
