Randall Munroe

These charts show movie character interactions. The horizontal axis is time. The vertical grouping of the lines indicates which characters are together at a given time.

Image caption: "In the LotR map, up and down correspond LOOSELY to northwest and southeast respectively"

This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License.


New England Journal of Medicine

E. Sidransky, M.A. Nalls, J.O. Aasly, J. Aharon-Peretz, G. Annesi, E.R. Barbosa, A. Bar-Shira, D. Berg, J. Bras, A. Brice, C.-M. Chen, L.N. Clark, C. Condroyer, E.V. De Marco, A. Dürr, M.J. Eblan, S. Fahn, M.J. Farrer, H.-C. Fung, Z. Gan-Or, T. Gasser, R. Gershoni-Baruch, N. Giladi, A. Griffith, T. Gurevich, C. Januario, P. Kropp, A.E. Lang, G.-J. Lee-Chen, S. Lesage, K. Marder, I.F. Mata, A. Mirelman, J. Mitsui, I. Mizuta, G. Nicoletti, C. Oliveira, R. Ottman, A. Orr-Urtreger, L.V. Pereira, A. Quattrone, E. Rogaeva, A. Rolfs, H. Rosenbaum, R. Rozenberg, A. Samii, T. Samaddar, C. Schulte, M. Sharma, A. ...

Recent studies indicate an increased frequency of mutations in the gene encoding glucocerebrosidase (GBA), a deficiency of which causes Gaucher's disease, among patients with Parkinson's disease. We aimed to ascertain the frequency of GBA mutations in an ethnically diverse group of patients with Parkinson's disease.

Genotypes and phenotypic data from a total of 5691 patients with Parkinson's disease (780 Ashkenazi Jews) and 4898 controls (387 Ashkenazi Jews) were analyzed, with multivariate logistic-regression models and the Mantel–Haenszel procedure used to estimate odds ratios across centers.
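As a rough illustration of how an odds ratio can be pooled across centers, here is a minimal sketch of the Mantel–Haenszel estimator. The function name and the per-center counts are hypothetical and are not taken from the study; the multivariate logistic-regression models are not covered.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata (for example, study centers).

    Each stratum is a tuple (a, b, c, d):
      a = patients with a mutation,  b = patients without,
      c = controls with a mutation,  d = controls without.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these counts")
    return numerator / denominator


# Hypothetical counts for three centers (illustrative only).
centers = [(30, 470, 5, 495), (12, 188, 3, 197), (45, 255, 10, 290)]
pooled_or = mantel_haenszel_or(centers)
print(f"pooled odds ratio: {pooled_or:.2f}")
```

With a single stratum the estimator reduces to the ordinary odds ratio ad/bc, which is a convenient sanity check.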

Data collected from 16 centers demonstrate that there is a strong association between GBA mutations and ...



Leonardo Ariel Saravia, Adonis Giorgi, Fernando Momo

Software used to determine multifractal spectra: the generalized dimensions Dq and the spectrum of singularities f(alpha). The software was used in the paper: Saravia LA, Giorgi A, Momo F (2012) Multifractal growth in periphyton communities. Oikos 121: 1810–1820.

mfSBA estimates both spectra and outputs data to evaluate the fits. A more detailed description is in the file mfSBA_README.

mfSBArnz estimates the Dq spectrum and randomizes the original image N times to calculate a confidence interval to test the hypothesis that the original distribution was random.
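To make the idea of a Dq spectrum concrete, the following is a minimal box-counting sketch. It is not the mfSBA implementation, whose algorithm is described in mfSBA_README; the function names, the square-grid input and the choice of box sizes are assumptions made for illustration.

```python
import math


def box_masses(grid, size):
    """Normalized mass in each non-empty size-by-size box of a square grid."""
    n = len(grid)
    total = sum(map(sum, grid))
    masses = []
    for r in range(0, n, size):
        for c in range(0, n, size):
            m = sum(grid[i][j]
                    for i in range(r, r + size)
                    for j in range(c, c + size))
            if m > 0:
                masses.append(m / total)
    return masses


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def generalized_dimension(grid, q, sizes):
    """Estimate Dq as the slope of the scaled log partition function
    against log box size. Box sizes must divide the grid side."""
    xs, ys = [], []
    for s in sizes:
        p = box_masses(grid, s)
        xs.append(math.log(s))
        if abs(q - 1.0) < 1e-12:
            # q = 1 is the limiting case (information dimension)
            ys.append(sum(pi * math.log(pi) for pi in p))
        else:
            ys.append(math.log(sum(pi ** q for pi in p)) / (q - 1.0))
    return least_squares_slope(xs, ys)


# A uniform 16 x 16 image should have Dq close to 2 for every q.
uniform = [[1.0] * 16 for _ in range(16)]
for q in (0, 1, 2):
    print(q, generalized_dimension(uniform, q, sizes=[1, 2, 4, 8]))
```

A randomization test in the spirit of mfSBArnz would shuffle the image values, recompute Dq for each shuffle, and compare the original estimate with the resulting distribution.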

multiSpeciesSBA estimates the multifractal spectra of a 2D species distribution assuming that each position is one individual ...


Proceedings of the National Academy of Sciences of the United States of America (2005)

Jared Tanner, David Donoho Coders: Jared Tanner, David Donoho

This code provides graphs of the neighborliness threshold rho computed by Donoho and Tanner relative to the Vershik-Sporyshev one. It also presents the behavior of the exponents for the combinatorial prefactor, external and internal angle. For more information, please visit the SparseLab (Seeking Sparse Solutions to Linear Systems of Equations) website (



S. Cobey, M. Lipsitch

Over 90 capsular serotypes of Streptococcus pneumoniae, a common nasopharyngeal colonizer and major cause of pneumonia, bacteremia, and meningitis, are known. It is unclear why some serotypes can persist at all: They are more easily cleared from carriage and compete poorly in vivo. Serotype-specific immune responses, which could promote diversity in principle, are weak enough to allow repeated colonizations by the same type. We show that weak serotype-specific immunity and an acquired response not specific to the capsule can together reproduce observed diversity. Serotype-specific immunity stabilizes competition, and acquired immunity to noncapsular antigens reduces fitness differences. Our model can be ...


Signal Processing (2006)

Michael Elad, David Donoho Coders: Michael Elad, David Donoho

This code graphically analyzes the behavior of the basis pursuit (BP) algorithm in the presence of noise. The stability conditions are given for a general dictionary as well as for a union of orthonormal matrices. For this, the user must specify the mutual incoherence (M), the signal length (N), the noise-to-signal ratio (NSR), the maximal number of orthonormal matrices (J) and the normalized NSR (R). Set M to 1/100 to get the same second set of plots as in the paper. For more information, please visit the SparseLab (Seeking Sparse Solutions to Linear Systems of Equations) website (


Journal of Chemical Information and Modeling

Igor V. Filippov, Marc C. Nicklaus

Until recently most scientific and patent documents dealing with chemistry have described molecular structures either with systematic names or with graphical images of Kekulé structures. The latter method poses inherent problems in the automated processing that is needed when the number of documents ranges in the hundreds of thousands or even millions since graphical representations cannot be directly interpreted by a computer. To recover this structural information, which is otherwise all but lost, we have built an optical structure recognition application based on modern advances in image processing implemented in open source tools, OSRA. OSRA can read documents in over ...

Details (2012)

Aron Jamil Ahmadia, David Ketcheson Coders: Aron Jamil Ahmadia, David Ketcheson

This code reproduces Figures 3a, 3b, 4a, 4b, 6a, 6b and 7b of the article “Optimal stability polynomials for numerical integration of initial value problems” (David I. Ketcheson and Aron J. Ahmadia, 2012). The user can fix the number of stages (s) and the order (p), and then retrieve (1) the scaled size of real axis interval inclusion for optimized methods Hopt/s^2 (Table 1, page 12), (2) the scaled size of imaginary axis inclusion for optimized methods Hopt/s (Table 2, page 13), (3) the relative size of largest disk that can be included in the stability region scaled by ...


Economics Letters (2005)

Amélie Charles, Olivier Darné Coders: Amélie Charles, Denisa Banulescu, Elena-Ivona Dumitrescu, Olivier Darné

This code identifies additive outliers (AO) and innovative outliers (IO) in a GARCH(1,1) model. Based on Franses and Ghijsels (1999), it uses the outlier detection method proposed by Chen and Liu (1993). To run this code, input the series of returns and choose the type of outlier to be identified as well as the critical value. The recommended critical value (C=10) is based on simulation experiments proposed by Franses and Dijk (2000).



Oksana Lukjancenko, Martin Christen Thomsen, Mette Voldby Larsen, David Wayne Ussery

PanFunPro stands for PAN-genome analysis based on FUNctional PROfiles. PanFunPro is a tool for pan-genome analysis. It offers the following functionality: (1) homology detection and genome functional characterization by three HMM-collections, (2) pan-/core genome calculation within the set of proteomes, (3) pairwise pan-/core-genome analysis, (4) specific genome calculation for different subsets of genomes as well as pairwise analysis of specific proteomes, (5) basic statistics for the output genes from the pan-/core-/specific-genome calculation, (6) analysis of available GO information for the output genes from the pan-/core-/specific-genome calculation.

PanFunPro is provided in two parts: and


M.O. Finkelstein and B. Levin

These data are crime-related and demographic statistics for 47 US states in 1960. The data were collected from the FBI's Uniform Crime Report and other government agencies to determine how the variable crime rate depends on the other variables measured in the study.

In May, 1978, Brink's Inc. was awarded a contract to collect coins from some 70,000 parking meters in New York City for delivery to the City Department of Finance. Sometime later the City became suspicious that not all of the money collected was being returned to the city. In April of 1978 five Brink's collectors were ...



C. D. Greenman, G. Bignell, A. Butler, S. Edkins, J. Hinton, D. Beare, S. Swamy, T. Santarius, L. Chen, S. Widaa, P. A. Futreal, M. R. Stratton

High-throughput oligonucleotide microarrays are commonly employed to investigate genetic disease, including cancer. The algorithms employed to extract genotypes and copy number variation function optimally for diploid genomes usually associated with inherited disease. However, cancer genomes are aneuploid in nature, leading to systematic errors when using these techniques. We introduce a preprocessing transformation and hidden Markov model algorithm bespoke to cancer. This produces genotype classification, specification of regions of loss of heterozygosity, and absolute allelic copy number segmentation. Accurate prediction is demonstrated with a combination of independent experimental techniques. These methods are exemplified with Affymetrix genome-wide SNP6.0 data from 755 cancer ...


Journal of Banking and Finance (2012)

Pei Pei, Juan Carlos Escanciano Coders: Pei Pei, Juan Carlos Escanciano

The dataset contains the returns for three portfolios based on three representative US stocks traded on the New York Stock Exchange (NYSE). The stocks are Walt Disney (DIS), General Electric (GE) and Merck & Company (MRK). Daily data on their market closure prices are collected over the period of 01/04/1999–12/31/2009, and then the daily returns are calculated as 100 times the difference of the log prices. The compositions of the three portfolios considered are (0.4, 0.1, 0.5), (0.1, 0.1, 0.8) and (0.3, 0.1, 0.6), respectively, where the numbers in each set of parentheses from left to right represent the portfolio weights on ...
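The return calculation described above can be sketched as follows. The prices are made up, the function names are hypothetical, and treating a portfolio return as the weighted sum of the asset returns is an assumption; the description does not spell out how the portfolio series were formed or which weight belongs to which stock.

```python
import math


def log_returns_pct(prices):
    """Daily returns as 100 times the difference of consecutive log prices."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]


def portfolio_returns(asset_returns, weights):
    """Weighted sum of asset returns, day by day (assumed construction)."""
    return [sum(w * r for w, r in zip(weights, day))
            for day in zip(*asset_returns)]


# Hypothetical closing prices for three assets (not the actual data).
prices_a = [30.0, 30.6, 30.3]
prices_b = [35.0, 34.8, 35.2]
prices_c = [80.0, 81.0, 80.5]

returns = [log_returns_pct(p) for p in (prices_a, prices_b, prices_c)]
print(portfolio_returns(returns, weights=(0.4, 0.1, 0.5)))
```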



A. Ghosh, W. E. Holt

Delineating the driving forces behind plate motions is important for understanding the processes that have shaped Earth throughout its history. However, the accurate prediction of plate motions, boundary-zone deformation, rigidity, and stresses remains a difficult frontier in numerical modeling. We present a global dynamic model that produces a good fit to such parameters by accounting for lateral viscosity variations in the top 200 kilometers of Earth, together with forces associated with topography and lithosphere structure, as well as coupling with mantle flow. The relative importance of shallow structure versus deeper mantle flow varies over Earth’s surface. Our model reveals where ...



B. J. Shapiro, J. Friedman, O. X. Cordero, S. P. Preheim, S. C. Timberlake, G. Szabo, M. F. Polz, E. J. Alm

Genetic exchange is common among bacteria, but its effect on population diversity during ecological differentiation remains controversial. A fundamental question is whether advantageous mutations lead to selection of clonal genomes or, as in sexual eukaryotes, sweep through populations on their own. Here, we show that in two recently diverged populations of ocean bacteria, ecological differentiation has occurred akin to a sexual mechanism: A few genome regions have swept through subpopulations in a habitat-specific manner, accompanied by gradual separation of gene pools as evidenced by increased habitat specificity of the most recent recombinations. These findings reconcile previous, seemingly contradictory empirical observations ...


Vitamins & Minerals

Mahendra Kumar Trivedi, Alice Branton, Dahryn Trivedi, Gopal Nayak

Magnesium (Mg), present in every cell of all living organisms, is an essential nutrient and is primarily responsible for the catalytic reactions of over 300 enzymes. The aim of the present study was to evaluate the effect of biofield treatment on the atomic and physical properties of magnesium powder. Magnesium powder was divided into two parts, denoted control and treatment. The control part remained untreated, and the treatment part received biofield treatment. Both control and treated magnesium samples were characterized using X-ray diffraction (XRD), surface area and particle size analyzer. XRD data showed that biofield treatment has altered the lattice parameter, unit cell ...


SSRN (2012)

Carole Bernard, Zhenyu Cui Coders: Carole Bernard, Zhenyu Cui

The program implements formulas for each proposition of the paper. In particular, it computes the fair strike of the discrete variance swap and the continuous variance swap in the Heston and Hull-White models. It also gives asymptotics.
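For orientation, the fair strike of a continuously sampled variance swap in the Heston model has a well-known closed form, the time-averaged expected variance. The sketch below implements only that textbook expression; the parameter names (v0, theta, kappa, maturity) are assumptions and may differ from the paper's notation, and neither the discrete fair strike nor the Hull-White case is covered.

```python
import math


def heston_continuous_fair_strike(v0, theta, kappa, maturity):
    """Fair strike of a continuously sampled variance swap under Heston.

    v0       initial variance
    theta    long-run variance level
    kappa    speed of mean reversion (must be positive)
    maturity time to maturity in years (must be positive)

    Returns (1/T) * E[integral of V_t over [0, T]]
          = theta + (v0 - theta) * (1 - exp(-kappa*T)) / (kappa*T).
    """
    x = kappa * maturity
    if x <= 0:
        raise ValueError("kappa and maturity must be positive")
    weight = -math.expm1(-x) / x  # (1 - e^{-x}) / x, stable for small x
    return theta + (v0 - theta) * weight


# Hypothetical parameters, for illustration only.
print(heston_continuous_fair_strike(v0=0.09, theta=0.04, kappa=2.0, maturity=1.0))
```

When v0 equals theta the strike is simply theta, and for large kappa times maturity it approaches theta regardless of v0.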


Scientific Reports

Sheng Wang, Jianzhu Ma, Jian Peng, Jinbo Xu

Protein structure alignment is a fundamental problem in computational structure biology. Many programs have been developed for automatic protein structure alignment, but most of them align two protein structures purely based upon geometric similarity without considering evolutionary and functional relationship. As such, these programs may generate structure alignments which are not very biologically meaningful from the evolutionary perspective. This paper presents a novel method DeepAlign for automatic pairwise protein structure alignment. DeepAlign aligns two protein structures using not only spatial proximity of equivalent residues (after rigid-body superposition), but also evolutionary relationship and hydrogen-bonding similarity. Experimental results show that DeepAlign can ...



Sandeep Chakraborty, Ravindra Venkatramani, Basuthkar J. Rao, Bjarni Asgeirsson, Abhaya M. Dandekar

  • manualescapist.pdf: Manual for using the ESCAPIST program (includes installation instructions).

  • manualproquad.pdf: Manual for using the PROQUAD program (includes installation instructions).

  • pd.CB.score.full: Potential difference for CBeta atom. Distribution of Cbeta values in different amino acid pairs, learnt from the PISCES database.

  • perl_scripts: Contains the Perl source code for PROQUAD and ESCAPIST.


SSRN Electronic Journal

David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, Qiji Jim Zhu

Recent computational advances allow investment managers to search for profitable investment strategies. In many instances, that search involves a pseudo-mathematical argument, which is spuriously validated through a simulation of its historical performance (also called backtest).

We prove that high performance is easily achievable after backtesting a relatively small number of alternative strategy configurations, a practice we denote “backtest overfitting”. The higher the number of configurations tried, the greater is the probability that the backtest is overfit. Because financial analysts rarely report the number of configurations tried for a given backtest, investors cannot evaluate the degree of overfitting in most investment ...
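The effect can be illustrated with a small simulation: every "strategy" below is pure noise with zero expected return, yet the best in-sample Sharpe ratio grows with the number of configurations tried. This is a hedged illustration, not the paper's proof or code; the function names, sample sizes and seed are arbitrary assumptions.

```python
import math
import random


def sharpe_ratio(returns):
    """Per-period Sharpe ratio (mean over sample standard deviation)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) if var > 0 else 0.0


def best_backtest_sharpe(n_configs, n_obs, rng):
    """Best in-sample Sharpe ratio among n_configs skill-less strategies."""
    return max(
        sharpe_ratio([rng.gauss(0.0, 1.0) for _ in range(n_obs)])
        for _ in range(n_configs)
    )


def mean_best_sharpe(n_configs, n_obs=100, trials=100, seed=0):
    """Average of the best in-sample Sharpe ratio over repeated experiments."""
    rng = random.Random(seed)
    total = sum(best_backtest_sharpe(n_configs, n_obs, rng)
                for _ in range(trials))
    return total / trials


for n in (1, 5, 20):
    print(n, round(mean_best_sharpe(n), 3))
```

Because every simulated strategy has a true Sharpe ratio of zero, any positive value selected this way is an artifact of the search.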


Journal of Neuroscience Methods

Jonathan W. Peirce

The vast majority of studies into visual processing are conducted using computer display technology. The current paper describes a new free suite of software tools designed to make this task easier, using the latest advances in hardware and software. PsychoPy is a platform-independent experimental control system written in the Python interpreted language using entirely free libraries. PsychoPy scripts are designed to be extremely easy to read and write, while retaining complete power for the user to customize the stimuli and environment.

Tools are provided within the package to allow everything from stimulus presentation and response collection (from a wide range ...



Handan Wand, Jenny Iversen, Matthew Law, Lisa Maher, Richard H. Barton

Graphical representation of data is one of the most easily comprehended forms of explanation. The current study describes a simple visualization tool which may allow greater understanding of medical and epidemiological data.

We propose a simple tool for visualization of data, known as a “quilt plot”, that provides an alternative to presenting large volumes of data as frequency tables. Data from the Australian Needle and Syringe Program survey are used to illustrate “quilt plots”.

Visualization of large volumes of data using “quilt plots” enhances interpretation of medical and epidemiological data. Such intuitive presentations are particularly useful for the rapid assessment ...


Statistical Analysis and Data Mining

Hemant Ishwaran, Udaya B. Kogalur, Xi Chen, Andy J. Minn

Minimal depth is a dimensionless order statistic that measures the predictiveness of a variable in a survival tree. It can be used to select variables in high-dimensional problems using Random Survival Forests (RSF), a new extension of Breiman's Random Forests (RF) to survival settings. We review this methodology and demonstrate its use in high-dimensional survival problems using a public domain R-language package randomSurvivalForest. We discuss effective ways to regularize forests and discuss how to properly tune the RF parameters ‘nodesize’ and ‘mtry’. We also introduce new graphical ways of using minimal depth for exploring variable relationships.
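As a toy illustration of the idea, the sketch below computes how close to the root a variable first splits in a single tree, with the root at depth 0. The nested-dict tree and the function name are assumptions for illustration and are not the randomSurvivalForest package's data structures or API; variables that never split return None here, whereas a forest-level analysis would need a convention for them and would average over trees.

```python
def minimal_depth(node, variable, depth=0):
    """Depth of the shallowest node that splits on `variable`.

    A node is a dict with keys "var", "left" and "right"; a leaf is a
    dict without a "var" key. Returns None if the variable never splits.
    """
    if not node or "var" not in node:
        return None  # reached a leaf
    if node["var"] == variable:
        return depth
    depths = [
        minimal_depth(child, variable, depth + 1)
        for child in (node.get("left"), node.get("right"))
    ]
    depths = [d for d in depths if d is not None]
    return min(depths) if depths else None


# Hypothetical tree: splits on "age" at the root, then "bmi", then "sex".
tree = {
    "var": "age",
    "left": {
        "var": "bmi",
        "left": {},
        "right": {"var": "sex", "left": {}, "right": {}},
    },
    "right": {},
}

for name in ("age", "bmi", "sex", "smoker"):
    print(name, minimal_depth(tree, name))
```

Smaller values indicate variables that split nearer the root, which is the sense in which minimal depth measures predictiveness.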


Wordpress Blog

Ulrich Kleinwechter, Victor Suarez

This archive contains code and data for a social network analysis of international potato trade that was published on 08 November 2012.

It can serve as a reference for understanding the analysis, and as a basis for replication of the results as well as for carrying out more detailed analyses of international potato trade.

The following information and data are included:

  • The R code used for the social network analysis of international potato trade.

  • Data files with bilateral matrices of global trade in fresh, frozen and seed potatoes.

  • Data files with supplementary ...



C. Enzinger, F. Fazekas, P. M. Matthews, S. Ropele, H. Schmidt, S. Smith, R. Schmidt

We assessed the brain parenchymal fraction at baseline and subsequent annual brain volume changes over 6 years for 201 participants (F/M = 96/105; 59.8 ± 5.9 years) in the Austrian Stroke Prevention Study from 1.5-T MRI scans using SIENA (structural image evaluation using normalization of atrophy)/SIENAX (an adaptation of SIENA for cross-sectional measurement). Hypertension, cardiac disease, diabetes mellitus, smoking, and regular alcohol intake were present in 64 (31.8%), 60 (29.9%), 5 (2.5%), 70 (39.3%), and 40 (20.7%) subjects, respectively. Plasma levels of fasting glucose (93.7 ± 18.6 mg/dL), glycated hemoglobin A (HbA1c; 5.6 ± 0.7%), total cholesterol (228.3 ± 40.3 ...

