Knowledge of Character Sets and Encodings

Standard Developer Shirt, License: (CC BY 2.0), Author: https://www.flickr.com/photos/acidpix/

A lot of programming happens in research data centers. Anyone who programs is developing software. In an older but still worthwhile post, Joel Spolsky defines a minimum level of knowledge about character sets and encodings: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Two new Stata packages -useold- and -saveascii- now available on SSC

Thanks to the SSC maintainer Kit Baum, two new commands are available on SSC: useold and saveascii. Both deal with Unicode translation in Stata 14 (or newer).

useold works as a drop-in replacement for Stata’s regular use command. If the Stata instance executing the command is version 14 or newer, it checks whether Unicode translation is necessary and, if so, runs unicode translate on a temporary copy of the file before opening it. The operating system’s default code page is assumed as the source encoding; this assumption may be wrong and can be overridden via an option.
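
To see why the assumed source encoding matters, here is a small Python sketch. It is not Stata code and not how useold is implemented; it only illustrates the general problem. The sample label and the two code pages (cp1252 and cp437) are assumptions chosen for illustration.

```python
# Bytes as an older, non-Unicode file might store the label "Größe"
# (assumption: the file was written using Windows-1252).
raw = "Größe".encode("cp1252")

# Decoding with the correct code page recovers the original text.
correct = raw.decode("cp1252")

# Decoding with a wrong guess (here cp437) raises no error;
# it silently produces different, garbled characters.
wrong = raw.decode("cp437")

# Re-encode the correctly decoded text as UTF-8, the encoding Stata 14+ uses.
utf8_bytes = correct.encode("utf-8")

print(correct)
print(wrong)
print(utf8_bytes)
```

Because a wrong guess fails silently, it is worth checking labels with non-ASCII characters after translation and overriding the source encoding if they look garbled.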

You can install useold with:

ssc install useold

saveascii works as a drop-in replacement for Stata’s regular saveold command. It implements the conversion functions presented by Alan Riley on Statalist. If the Stata instance executing the command is version 14 or newer, all Unicode content (data labels, variable names, variable labels, value label names and contents, characteristic names and contents) is converted to ASCII before saveold is run. The operating system’s default code page is assumed as the target encoding; this assumption may be wrong and can be overridden via an option.
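
The effect of the target encoding can be illustrated with a short Python sketch. Again, this is not Stata code and makes no claim about how saveascii handles characters internally. The sample label, the choice of cp1252 as target code page, and the use of "?" as a substitute are all assumptions for illustration.

```python
# A Unicode label containing characters from several scripts.
label = "Größe in m², 日本"

# Encoding to an extended code page keeps the characters it contains
# and substitutes "?" for everything else.
as_cp1252 = label.encode("cp1252", errors="replace")

# Encoding to plain ASCII substitutes every character above code point 127.
as_ascii = label.encode("ascii", errors="replace")

print(as_cp1252.decode("cp1252"))
print(as_ascii.decode("ascii"))
```

The conversion is lossy in both cases, so the choice of target encoding determines how much of the original text survives in the saved file.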

You can install saveascii with:

ssc install saveascii

Both packages come with help files that contain more details on how to use them.