The submission deadline for contributions to the European DDI User Conference is 6 September 2015. The conference itself takes place on 2 and 3 December in Copenhagen. The conference website already contains the most important information and the CfP.
Dictionary with terms for the Research Data Domain
The Consortia Advancing Standards in Research Administration Information (CASRAI) provides a dictionary containing terms for the Research Data Domains. Each term has a unique identifier (UUID) and a URL that can be used as references to enhance reading comprehension of documents by hyperlinking terms to their definition. The URL for each term contains a link to a Discussion page to complete the feedback loop with the community of users.
The Glossary has been developed in consultation with vocabulary experts and practitioners from a wide cross-section of stakeholder groups. It is meant to be a practical reference for individuals and working groups concerned with the improvement of research data management, and as a meeting place for further discussion and development of terms. The aim is to create a stable and sustainably governed glossary of community accepted terms and definitions, and to keep it relevant by maintaining it as a ‘living document’ that is updated when necessary.
From other sections of the dictionary, one can return to this pilot section using the top-menu item Filter by and selecting Research Data Domain. To see all terms in the CASRAI dictionary (including the RDC terms), go here: http://dictionary.casrai.org/Category:Terms
In addition to direct comments on specific terms in the Glossary, CASRAI is very interested in receiving feedback about the Glossary in general. Here is a short survey: https://www.surveymonkey.com/r/Glossary_ResearchDataManagement
This section of the dictionary is developed and maintained by Research Data Canada’s (RDC) Standards & Interoperability Committee (http://www.rdc-drc.ca) in collaboration with CASRAI. It is made publicly available under a Creative Commons Attribution Only license (CC-BY).
(via [DDI-users])
Two new Stata packages -useold- and -saveascii- now available on SSC
Thanks to the SSC maintainer Kit Baum, two new commands are available on SSC: useold and saveascii. Both deal with Unicode translation in Stata 14 (or newer).
useold works as a drop-in replacement for Stata’s regular use command. If the Stata instance executing the command is version 14 or newer, it checks whether Unicode translation is necessary and, if so, runs unicode translate on a temporary copy of the file before opening it. The operating system’s default code page is assumed as the source encoding (this might be wrong and can be overridden via an option).
You can install useold with:
ssc install useold
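The idea of translating only when necessary can be illustrated outside Stata. The following is a minimal Python sketch, not the implementation used by useold: the function names `needs_translation` and `to_utf8` and the default source code page `cp1252` are assumptions made for illustration.

```python
def needs_translation(raw: bytes) -> bool:
    """Return True if the bytes are not already valid UTF-8."""
    try:
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True


def to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode raw bytes as UTF-8, translating only when necessary.

    source_encoding is an assumed default; like the option mentioned
    above, it should be overridden when the guess is wrong.
    """
    if not needs_translation(raw):
        return raw  # pure ASCII or already UTF-8: leave untouched
    return raw.decode(source_encoding).encode("utf-8")


# A variable label stored in a legacy Windows code page.
legacy_label = "Größe".encode("cp1252")
print(to_utf8(legacy_label).decode("utf-8"))
```

Text that is plain ASCII or already valid UTF-8 passes through unchanged, which mirrors the check-before-translate behaviour described above.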
saveascii works as a drop-in replacement for Stata’s regular saveold command. It implements conversion functions as presented by Alan Riley on Statalist. If the Stata instance executing the command is version 14 or newer, all Unicode contents (data labels, variable names, variable labels, value label names and contents, characteristics names and contents) are converted to ASCII before running saveold. The operating system’s default code page is assumed as the target encoding (this might be wrong and can be overridden via an option).
You can install saveascii with:
ssc install saveascii
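The reverse direction, converting Unicode text to a legacy target encoding, can be sketched in the same spirit. This Python example is an illustration only and does not reproduce saveascii or the Statalist conversion functions; the helper names `to_codepage` and `convert_labels`, the sample labels, and the target code page `cp1252` are assumptions.

```python
def to_codepage(text: str, target_encoding: str = "cp1252") -> bytes:
    """Encode text for a legacy code page; unmappable characters become '?'."""
    return text.encode(target_encoding, errors="replace")


def convert_labels(labels: dict, target_encoding: str = "cp1252") -> dict:
    """Convert every label in a {variable name: label} mapping."""
    return {
        name: to_codepage(label, target_encoding)
        for name, label in labels.items()
    }


# Hypothetical variable labels; the second contains a character
# that the assumed target code page cannot represent.
labels = {"groesse": "Körpergröße in cm", "note": "Symbol ≥ 5"}
converted = convert_labels(labels)
print(converted)
```

Characters without a counterpart in the target code page are lost in such a conversion, which is why the choice of target encoding matters.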
Both packages come with help files that contain more details on how to use them.
Working with the PASS data: User Guide Examples in SPSS
While the PASS Scientific Use File has been available in SPSS format since wave 1, PASS support documents for SPSS users have not been available so far. With the recent release of a new Quick Start File, the PASS team now provides all the worked examples from the PASS User Guide, originally done in Stata, as SPSS/PASW code. This includes examples for merging household, individual, spell and weight datasets, as well as using the cross-sectional and longitudinal weights for projections to different populations.
PASS Quick Start File – Analysing the PASS data using SPSS/PASW
splitit and combival – two new Stata ados
splitit is for spell data: It splits overlapping spells within a case and leaves the data otherwise untouched. The resulting file has the fewest possible number of split spells satisfying the condition not to overlap, for a given data set.
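To make the splitting idea concrete, here is a minimal Python sketch. It is not the splitit code: the function name `split_spells`, the tuple layout and the use of inclusive integer time units (e.g. months) are assumptions. A spell is cut only where another spell of the same case starts or ends inside it, so spells that do not overlap stay as they are.

```python
from collections import defaultdict


def split_spells(spells):
    """Split spells so that, within a case, pieces are identical or disjoint.

    spells: list of (case_id, start, end) with inclusive integer time units.
    """
    by_case = defaultdict(list)
    for case_id, start, end in spells:
        by_case[case_id].append((start, end))

    result = []
    for case_id, case_spells in by_case.items():
        # Every spell start, and every point right after a spell end,
        # is a potential cut point for the other spells of the case.
        cuts = sorted({s for s, _ in case_spells} | {e + 1 for _, e in case_spells})
        for start, end in case_spells:
            # Keep only cut points strictly inside this spell.
            bounds = [start] + [c for c in cuts if start < c <= end] + [end + 1]
            for left, right in zip(bounds, bounds[1:]):
                result.append((case_id, left, right - 1))
    return result


# Case 1 has two overlapping spells; case 2 has a single spell.
example = [(1, 1, 10), (1, 5, 15), (2, 3, 6)]
print(split_spells(example))
# [(1, 1, 4), (1, 5, 10), (1, 5, 10), (1, 11, 15), (2, 3, 6)]
```

After splitting, the overlap of the two spells in case 1 is represented by two pieces covering exactly the same period (5 to 10), which makes it straightforward to compare or prioritise concurrent states.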
combival is designed for exploratory and data preparation purposes: It creates variables that combine the levels of a source variable within a defined group of observations. Thus, information that is spread over several observations is compiled and displayed for each observation of the group.
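The following Python sketch illustrates this kind of within-group compilation. It is not the combival code; the function name `combine_values`, its parameters and the variables `hh` and `status` are hypothetical.

```python
def combine_values(rows, group_key, source_key, target_key="combined", sep=" "):
    """Attach to each row the distinct values of source_key within its group."""
    levels = {}
    for row in rows:
        levels.setdefault(row[group_key], set()).add(row[source_key])
    return [
        {**row, target_key: sep.join(str(v) for v in sorted(levels[row[group_key]]))}
        for row in rows
    ]


# Hypothetical data: employment status codes of household members.
rows = [
    {"hh": 1, "status": 2},
    {"hh": 1, "status": 1},
    {"hh": 1, "status": 2},
    {"hh": 2, "status": 3},
]
for row in combine_values(rows, "hh", "status"):
    print(row)
```

Every member of household 1 then carries the combined value "1 2", so information that was spread over several observations is visible on each of them.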
To get more information or install the ado-files, use the Stata commands
help splitit
help combival
or
ssc install splitit
ssc install combival
These two ados, which may be useful for data management and data analysis, have been published on SSC by Ralf Künster (NEPS, WZB) and Klaudia Erhardt (SOEP, DIW).
Save the date: EDDI2015 on 2 and 3 December in Copenhagen
The 7th conference of European DDI users takes place on 2 and 3 December in Copenhagen. The deadline for submissions is 6 September; the call for papers will be published on 22 May. The conference website already contains the most important information.
Big Data: First DataFest Germany in Mannheim
From 20 to 22 March 2015, the first DataFest in Germany took place at the University of Mannheim. About 90 students from universities across Germany spent three days dissecting several gigabytes of mobile app data with Stata and R. The goal was to produce short presentations with which they could win prizes for the best insight, the best visualisation and the best use of additional data.
Paper: User-focused threat identification for anonymised microdata
When producing anonymised microdata for research, national statistics institutes (NSIs) identify a number of ‘risk scenarios’ of how intruders might seek to attack a confidential dataset. Hans-Peter Hafner, Felix Ritchie and Rainer Lenz argue in their paper “User-focused threat identification for anonymised microdata” (PDF) that the strategy used to identify confidentiality protection measures can be seriously misguided, mainly since scenarios focus on data protection without sufficient reference to other aspects of data. This paper brings together a number of findings to see how the above problem can be addressed in a practical context. Using as an example the creation of a scientific use file, the paper demonstrates that an alternative perspective can have dramatically different outcomes. (Source: Authors’ abstract)
SPSS pitfalls: Combining files with custom variable attributes
Adding custom variable attributes is a useful feature of SPSS, available since version 14 (2005). It can be used to assign additional information to variables and store it with the survey data, e.g. metadata or paradata. However, compared to the attributes reserved by SPSS (like variable labels or value labels), user-defined attributes demand extra attention and there are some pitfalls to look out for.
Data preparation: Merging cross-sectional and episode data
Surveys often collect information retrospectively over periods of time, e.g. in calendars or looped histories. The data may then be stored as a spell or episode dataset, separate from the cross-sectional data. For the panel “Arbeitsmarkt und soziale Sicherung” (PASS), there is now a paper that explains by example how to merge cross-sectional and episode data in such cases:
PASS Quick Start File – Spellinformationen im Querschnitt
Although the example works with PASS data, the Stata do-file is hopefully documented generally enough to be useful for other studies as well. Feedback welcome!
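One common way to bring spell information into cross-sectional data is to look up the spell that covers each interview date. The Python sketch below illustrates that idea only; it is not taken from the Quick Start File, and the function name `status_at_interview`, the variable names (`pid`, `interview`, `start`, `end`, `status`) and the YYYYMM integer dates are assumptions rather than PASS conventions.

```python
def status_at_interview(cross_section, spells):
    """Add the status of the spell covering each interview date.

    cross_section: list of dicts with 'pid' and 'interview' (YYYYMM integer).
    spells: list of dicts with 'pid', 'start', 'end' (YYYYMM, inclusive), 'status'.
    If several spells cover the date, the first one found is used;
    if none does, the status is None.
    """
    by_pid = {}
    for spell in spells:
        by_pid.setdefault(spell["pid"], []).append(spell)

    merged = []
    for row in cross_section:
        covering = [
            s["status"]
            for s in by_pid.get(row["pid"], [])
            if s["start"] <= row["interview"] <= s["end"]
        ]
        merged.append({**row, "status_at_interview": covering[0] if covering else None})
    return merged


# Hypothetical example data.
cross_section = [
    {"pid": 1, "interview": 201403},
    {"pid": 2, "interview": 201404},
]
spells = [
    {"pid": 1, "start": 201301, "end": 201312, "status": "unemployed"},
    {"pid": 1, "start": 201401, "end": 201406, "status": "employed"},
]
print(status_at_interview(cross_section, spells))
```

Taking the first covering spell is a simplification; with overlapping episodes, an explicit priority rule would be needed.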