ACRE and its numerous allied projects have built up many years of experience in digitising data. The following reference sources have proven useful in their work and will help those getting involved for the first time. Although they emphasise the needs of those making submissions to the International Surface Pressure Databank (ISPD), the resources are also of value to anyone involved in weather data rescue.

Data Rescue Forum
This web discussion board is devoted to issues affecting those who are rescuing historical weather data for submission to reanalysis data repositories. It covers topics relevant to preparing and formatting historical weather data to a quality level that makes it useful to the climate research community.

WMO Nomenclature
A reference list of WMO-recognised weather stations, which includes metadata such as station number, name, height, longitude and latitude.

ISPD submission guidelines
The guidelines for preparing digitised data for submission to the International Surface Pressure Databank, together with the metadata that identifies and qualifies each data item. Additional guideline information is also available.

Digitised Data To-date
An extensive list of current data holdings in the International Surface Pressure Databank, including the station and period covered.

Zooniverse
The world’s largest online platform for collaborative volunteer research, Zooniverse brings scientists and citizens together over the web. Scientists upload their data and choose the tasks they want volunteers to do.

Weather Wizard
A strip chart digitiser created by the International Environmental Data Rescue Organization (IEDRO). Many old weather observations were made by automatic instruments that recorded their readings as a continuous line on a strip chart. Weather Wizard is software that recovers data from these charts up to 30 times faster than traditional manual methods.
Users view a computer image of a chart and then sweep the cursor over the ink-traced line on the chart, registering the values it maps at whatever time interval is needed.

Excel PDF data extractor
Even some of the best OCR software is still poor at correctly extracting printed columns of data. However, Bytescout’s PDF Extractor SD freeware does a creditable job, presumably because it has been designed to work with columnar, numeric data.

Assistance for Document Imaging
In many cases, the precursor step to digitising weather data is the imaging (scanning or photographing) of original data documents. Typically the documents are recording sheets and booklets completed many years ago by weather observers. The documents are imaged to lessen the wear and tear on these valuable records, to create a record of provenance and to capture their contents for future use.

In some cases only parts of the documents are digitised. For instance, only pressure and attached thermometer readings may be captured by those making submissions to the ISPD, while the remaining data are made available to other repositories, primarily the International Surface Temperature Initiative (ISTI) and the Global Precipitation Climatology Centre (GPCC). If images of the original documents are accessible over the web, researchers can later refer back to them and extract other data of interest such as cloud cover, wind speed and sunshine.

The following are some of the freeware tools used by ACRE’s allied projects in their imaging workflow.

IrfanView
Proven freeware (30 years in development), this image management tool can manage large libraries of images and enhance them in many ways, including recolouring, resizing, straightening, cropping, watermarking, renaming, embedding metadata and changing canvas size. Most features can be used in batch mode, allowing sophisticated operations to be carried out automatically on thousands of images.

Rasterstitch (small fee involved)
An image stitching tool that joins multiple images to recreate a large format document. Traditional stitching tools blend two images together, which works well for photographic scenes. By working at the pixel level, however, Rasterstitch is geared to images of documents, achieving flawless re-compositions of large format textual, written and drawn material.

Canon EOS Utility
This software, designed for Canon EOS cameras, has a remote shooting capability that allows the camera to be mounted on a copy stand while being fully controlled via a computer screen and mouse. It includes visual aids to ensure that all photographed material is captured level and in a specified position within the image frame.

EXIFtoolGUI
Software that enables the bulk creation of a wide variety of metadata for insertion into multiple images. Useful for recording provenance and an audit trail in camera and scanned images.

RIOT
An industry standard freeware tool for decreasing the size of images with minimal loss of resolution. Useful for transferring images or displaying them over the web.

ByteScout PDF MULTITOOL
A useful multifunction freeware program that splits multi-page PDF documents into single PDF files, combines individual PDF files into a single multi-page document, rotates PDF images and converts PDF images to JPG, PNG and other formats. Importantly, of all OCR software, it is probably the best at extracting PDF data tables to a CSV file or an Excel spreadsheet.

Scan Tailor
A freeware program useful for post-processing scanned pages. It performs operations such as page splitting, deskewing and adding or removing borders. It takes raw scans and returns pages ready to be printed or assembled into a PDF or DJVU file. It does not perform optical character recognition or assemble multi-page documents.
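
Batch renaming of the kind mentioned under IrfanView can also be planned with a short script before any files are touched. The following is a minimal sketch only, not IrfanView's own mechanism: the function name plan_batch_rename, the station identifier "STN001" and the naming pattern (station prefix plus a zero-padded sequence number) are all hypothetical choices made for illustration.

```python
from pathlib import PurePath


def plan_batch_rename(filenames, station_id, start=1, width=4):
    """Map each original image name to a sequential, station-prefixed name.

    Names are processed in sorted order so the numbering is reproducible.
    The original file extension is kept, converted to lower case.
    Nothing is renamed on disk; the function only returns the plan.
    """
    plan = {}
    for offset, name in enumerate(sorted(filenames)):
        suffix = PurePath(name).suffix.lower()
        plan[name] = f"{station_id}_{start + offset:0{width}d}{suffix}"
    return plan


if __name__ == "__main__":
    # Hypothetical camera file names and station identifier.
    shots = ["IMG_0103.JPG", "IMG_0101.JPG", "IMG_0102.JPG"]
    for old, new in plan_batch_rename(shots, "STN001").items():
        print(old, "->", new)
```

Returning a plan rather than renaming immediately lets the mapping be reviewed, and stored alongside the images as part of the provenance record, before any change is made.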