Statistics

“…the story of Homo sapiens trying to stake a claim on shifting ground, flanked on both sides by beast and machine, pinned between meat and math.” (p.13)
No typo in the title, this is truly what this book by Brian Christian is called. It was kindly sent to me by my friends from BUY and I realised I could still write with my right hand when commenting in the margin. (I also found the most marvellous proof of a major theorem but the margin was just too small…) “The most human human: What artificial intelligence teaches us about being alive” is about the Turing test, designed to test whether an unknown interlocutor is a human or a machine. And eventually doomed to fail.
“The final test, for me, was to give the most uniquely human performance I could in Brighton, to attempt a successful defense against the machines.” (p.15)
What I had not realised earlier is that there is a competition every year running this test against a few AIs and a small group of humans, the judges (blindly) giving votes for each entity and selecting as a result the most human computer. And also the most human … human! This competition is called the Loebner Prize and it was taking place in Brighton, this most English of English seaside towns, in 2009 when Brian Christian took part in it (as a human, obviously!).
“Though both [sides] have made progress, the ‘algorithmic’ side of the field [of computer science] has, from Turing on, completely dominated the more ‘statistical’ side. That is, until recently.” (p.65)
I enjoyed the book, much more for the questions it brought out than for the answers it proposed, as the latter sounded unnecessarily conflictual to me, i.e. adopting an “us vs. ’em” posture and whining about humanity not fighting hard enough to keep ahead of AIs… I dislike this idea of the AIs being the enemy and of “humanity lost” the year AIs would fool the judges. While I enjoy the sci-fi literature where this antagonism is exacerbated, from Blade Runner to Hyperion to Neuromancer, I do not extrapolate those fantasised settings to the real world. For one thing, AIs are designed by humans, so having them win this test (or win against chess grandmasters) is a celebration of the human spirit, not a defeat! For another thing, we are talking about a fairly limited aspect of “humanity”, namely the ability to sustain a limited discussion with a set of judges on a restricted number of topics. I would be more worried if a humanoid robot managed to fool me by chatting with me for a whole transatlantic flight. For yet another thing, I do not see how this could reflect on the human race as a whole and indicate that it is regressing in any way. At most, it shows the judges were not trying hard enough (the questions reported in The most human human were not that exciting!) and maybe the human competitors had not intended to be perceived as humans.
“Does this suggest, I wonder, that entropy may be fractal?” (p.239)
Another issue that irked me in the author’s perspective is that he trained and elaborated a complex strategy to win the prize (sorry for the mini-spoiler: in case you did not know, Brian did finish as the most human human). I do not know if this worry about appearing less human than an AI was genuine or if it provided a convenient canvas for writing the book around the philosophical question of what makes us human(s). But it mostly highlights the artificial nature of the test, namely that one has to think in advance about the way conversations will be conducted, rather than engage in a genuine conversation with a stranger. This deserves the least human human label, in retrospect!
“So even if you’ve never heard of [Shannon entropy] before, something in your head intuits [it] every time you open your mouth.” (p.232)
The book spends a large amount of text/time on the vict
about 8 hours ago
Since 2011, net migration has been on the decline due to falling numbers of immigrants. What are the other key trends behind these often controversial statistics?
It's that time of year again: the release of migration statistics. Many will be keen to inspect how close these numbers come to the Conservatives' target to reduce net migration to 100,000 by 2015, when they will again face the vote. The latest numbers from the Office for National Statistics show that net migration was 153,000 in the year ending September 2012, compared to 242,000 the previous year.
Alan Travis has more on the story here, including this comment from the immigration minister, Mark Harper:
"The figures show we have cut out abuse while encouraging the brightest and best migrants who contribute to economic growth, with a 5% increase in the number of sponsored student visa applications for our world-class universities, and a 5% increase in the number of visas issued to skilled workers."
Latest totals
Though the data for the twelve months to September 2012 is still provisional, it suggests that half a million people immigrated to the UK. This represented a 14% reduction from 581,000 immigrants (or 'inflow' as it's named in the data).
Long-term emigration meanwhile is rising - up 2% from 339,000 in the year ending September 2011. Though immigration and emigration have moved in opposite directions over the past year, the changes have not been enough to offset one another, meaning that net migration remains a positive value.
Reasons for coming
A critical piece of information for policymakers - wherever they sit on the political spectrum - is the reason given by those who have decided to enter or leave the UK. Here, trends are just as visible as elsewhere. Formal study has been the most common reason given by those immigrating, followed by a work-related motivation.
Almost half (190,000) of long-term migrants state study as their reason for coming, though these individuals often receive less attention than the 62,000 who come to the UK to accompany or join a family member already here.
Also often overlooked is the fact that the majority of those who state work as their reason for coming (175,000) are also able to state that they have a definite job. A smaller fraction, 38%, come to the UK in search of employment. 58% of those leaving the UK cite work as a reason for doing so - of these, 64% have a definite job waiting for them in their destination of choice; the remainder state that they are heading off in search of work. Changes in motivation appear to coincide with the financial crisis - more people leaving the UK cited work as a reason for their decision after 2007. Similarly, 2007 was the first year in which more immigrants cited study rather than work as a reason for coming - a trend which has continued ever since.
Citizenship trends
Finally, who exactly is arriving and (probably a less controversial question) where are people leaving the UK going to? Well, the Office for National Statistics summarise these numbers using the following headings:
• British
• EU
• EU 15 (EU countries as constituted between 1 January 1995 and 1 May 2004)
• EU 8 (eight Central and Eastern European countries that acceded to the EU on 1 May 2004)
• All non-EU - which is comprised of
    • Old commonwealth
    • New commonwealth
    • Other foreign
When the numbers are broken down by citizenship, some of the most striking trends are to be seen among non-British citizens. For example, net migration of non-British citizens has fallen by 25% from 303,000 in 2011 to 228,000 in 2012. Net migration of EU citizens remained more stable, falling by 12% to 66,000 in 2012 compared to the previous year.
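The percentages quoted in this article can be re-derived from the totals it reports. The following is only a sketch in R: the variable and function names are illustrative, the inputs are the rounded figures quoted in the text (in thousands), and the 2012 outflow is not quoted directly but implied by inflow minus net migration.

```r
# Rounded figures quoted in the article, in thousands of people.
immigration_2011 <- 581   # inflow, year ending September 2011
immigration_2012 <- 500   # "half a million", year ending September 2012
emigration_2011  <- 339   # outflow, year ending September 2011
net_2012         <- 153   # net migration, year ending September 2012

# Percentage change from an old value to a new one.
pct_change <- function(old, new) 100 * (new - old) / old

# Net migration is inflow minus outflow, so the 2012 outflow is implied.
emigration_2012 <- immigration_2012 - net_2012

round(pct_change(immigration_2011, immigration_2012))  # -14: fall in immigration
round(pct_change(emigration_2011, emigration_2012))    #   2: rise in emigration
round(pct_change(303, 228))                            # -25: non-British net migration
```

Because the inputs are rounded, the results only match the published percentages after rounding to whole numbers.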
Non-EU destinations remain slightly more appealing to those leaving the UK - 78,000 headed to EU countries compared to 104,000 leaving to places outside the European Union.
Below are biannual and, where avail
about 12 hours ago
(This article was first published on Revolutions, and kindly contributed to R-bloggers)
The 7th annual R/Rmetrics Workshop on Computational Finance and Financial Engineering will take place June 30-July 4 in the beautiful alpine setting of Lake Thun, Switzerland. This is an intimate workshop limited to around 50 participants, and features tutorials from leading practitioners in finance with R, with a special focus on the Rmetrics suite of R packages. This year's program includes in-depth material from experts in academia and the finance industry, as you can see below:
Keynote Speaker:
Gunter Loeffler - University of Ulm, Institute of Finance
    Tower Building and Stock Market Returns
Tutorials:
Basics and Fundamentals:
Nicolas Polson, University of Chicago, School of Business, USA
    Bayesian Inference, Gibbs Sampling and Markov Chain Monte Carlo
Stefano Iacus, University of Milano, Department of Economics and Statistics, Milano, Italy
    Quasi Likelihood Inference and Model Selection for Stochastic Differential Equations
Modern Portfolio Design:
Bernhard Pfaff, Invesco Research Frankfurt, Germany
    Portfolio Selection, Optimization and Design with R
Diethelm Wuertz, Swiss Federal Institute of Technology, Zurich, Switzerland
    Portfolio Diversification and Stability Strategies
Advanced Computing in R:
Stefan Theussl, Raiffeisen Research and Vienna University of Economics, Austria
    High Performance Computing and Parallel R
Building Platforms:
Charles Roosen, Zurich Re Insurance, Zurich, Switzerland
    Behind the Zurich Re Insurance Platform
Wolfgang Breymann, Zurich University of Applied Sciences, Switzerland
    The Unified Financial Modeling Platform
For more information about the workshop, follow the link below.
Rmetrics.org: 2013 Meielisalp Workshop and Summer School
To leave a comment for the author, please follow the link and comment on his blog: Revolutions.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...
about 13 hours ago
(This article was first published on Revolutions, and kindly contributed to R-bloggers)
by Joseph Rickert
On May 10th and 11th, in honor of this being the International Year of Statistics, the Milwaukee Chapter of the American Statistical Association (MILWASA) held a workshop on cutting-edge uses of R in bioinformatics. One objective of the workshop was to show the "nuts and bolts" details of how R, with C++ integration and the specialized capabilities of the Bioconductor Project, provides a flexible, feature-rich platform for advanced bioinformatics applications. Featured speakers were: Denise Scholtens, who gave talks on analyzing microarray data using R and Bioconductor, building graphs with R and Bioconductor, gene set enrichment analysis, and Expression Set objects; Kwang-Youn Kim, who spoke on the analysis of RNA sequencing data using R and Bioconductor; and Dirk Eddelbuettel, who gave a thorough, four-part introduction to Rcpp, his package for integrating R with C++.
A tremendous amount of material from this workshop (pdfs, slides, data and R code) is available online. And, if you are interested in R and C++ integration, have a look at Dirk’s new book.
The following graph from Denise’s presentation on gene set enrichment analysis shows a portion of an induced gene ontology graph using the classic Fisher elimination algorithm and gives an idea of some of the sophisticated analyses you can do with her R code.
Note that if you want to run the code you will have to get some of the packages from Bioconductor. Here is some code from Kwang-Youn on how to get started.
    # Install all the necessary packages if not on your system
    source("http://bioconductor.org/biocLite.R")
    ## Bioconductor version 2.11 (BiocInstaller 1.8.3), ?biocLite for help
    biocLite(c("TxDb.Dmelanogaster.UCSC.dm3.ensGene", "ShortRead", "edgeR", "cummeRbund"))
    ## BioC mirror: http://bioconductor.org
    ## Using Bioconductor version 2.11 (BiocInstaller 1.8.3), R version 2.15.
    ## Installing package(s) 'TxDb.Dmelanogaster.UCSC.dm3.ensGene' 'ShortRead'
    ## 'edgeR' 'cummeRbund'
    ##
    ## The downloaded binary packages are in
    ## /var/folders/nk/9bnzzk_152vg4wslcbxc5g_c0000gn/T//RtmpzkrvEW/downloaded_packages
And here is some sample code from Dirk's presentation on calling R plot functions from C++.
    #include <RInside.h>                    // embedded R via RInside
    #include <unistd.h>                     // unlink()
    #include <cstdlib>                      // exit()
    #include <iostream>
    #include <string>
    int main(int argc, char *argv[]) {
        RInside R(argc, argv);              // create an embedded R instance
        // evaluate an R expression with curve()
        std::string cmd = "tmpf <- tempfile('curve'); "
                          "png(tmpf); curve(x^2, -10, 10, 200); "
                          "dev.off(); tmpf";
        // by running parseEval, we get filename back
        std::string tmpfile = R.parseEval(cmd);
        std::cout << "Could use plot in " << tmpfile << std::endl;
        unlink(tmpfile.c_str());            // cleaning up
        // alternatively, by forcing a display we can plot to screen
        cmd = "x11(); curve(x^2, -10, 10, 200); Sys.sleep(30);";
        R.parseEvalQ(cmd);
        exit(0);
    }
Revolution Analytics is proud to have been a sponsor for this workshop. Congratulations to Rodney Sparapani of the Medical College of Wisconsin for making it happen!
about 14 hours ago
Filed under: Mountains, pictures Tagged: Antarctica, big wall, climbing pictures, dvd, Posing Productions, Queen Maud, Ulvetanna
about 15 hours ago
(This article was first published on denis haine » R, and kindly contributed to R-bloggers)
Next topic from Veterinary Epidemiologic Research: chapter 19, modelling survival data. We start with non-parametric analyses where we make no assumptions about either the distribution of survival times or the functional form of the relationship between a predictor and survival. There are 3 non-parametric methods to describe time-to-event data: actuarial life tables, the Kaplan-Meier method, and the Nelson-Aalen method. We use data on the occurrence of calf pneumonia in calves raised in 2 different housing systems. Calves surviving to 150 days without pneumonia are considered censored at that time.
Actuarial Life Table
To create a life table, we use the function lifetab from package KMsurv, after calculating the number of censored observations and events at each time point and grouping them by time interval (with gsummary from package nlme).
    library(KMsurv)
Kaplan-Meier Method
To compute the Kaplan-Meier estimator we use the function survfit from package survival. It takes as argument a Surv object, which gives the time variable and the event of interest. You get the Kaplan-Meier estimate with the summary of the survfit object. We can then plot the estimates to show the Kaplan-Meier survivor function.
    library(survival)
Figure: Kaplan-Meier survivor function (95% CI)
Nelson-Aalen Method
A “hazard” is the probability of failure at a point in time, given that the calf had survived up to that point in time. A cumulative hazard, the Nelson-Aalen estimate, can be computed. The Nelson-Aalen estimate can be calculated by transforming the Fleming-Harrington estimate of survival.
Figure: Nelson-Aalen cumulative hazard function (95% CI)
Tests of the Overall Survival Curve
Several tests are available to test whether the overall survivor functions in 2 or more groups are equal.
We can use the log-rank test, the simplest test, assigning equal weight to each time point estimate and equivalent to a standard Mantel-Haenszel test. Also, there’s the Peto-Peto-Prentice test, which weights the stratum-specific estimates by the overall survival experience and so reduces the influence of different censoring patterns between groups.
To do these tests, we apply the survdiff function to the Surv object. The argument rho gives the weights according to S(t)^rho, where S(t) is the Kaplan-Meier estimate of survival, and may be any numeric value. Default is rho = 0, which gives the log-rank test. Rho = 1 gives the “Peto & Peto modification of the Gehan-Wilcoxon test”. Rho larger than zero gives greater weight to the first part of the survival curves. Rho smaller than zero gives weight to the later part of the survival curves.
    survdiff(Surv(days, pn == 1) ~ stock, data = calf_pneu, rho = 0) # rho is optional
    Call:
    survdiff(formula = Surv(days, pn == 1) ~ stock, data = calf_pneu, rho = 0)
                      N Observed Expected (O-E)^2/E (O-E)^2/V
    stock=batch      12        4     6.89      1.21      2.99
    stock=continuous 12        8     5.11      1.63      2.99
     Chisq= 3  on 1 degrees of freedom, p= 0.084
    survdiff(Surv(days, pn == 1) ~ stock, data = calf_pneu, rho = 1) # rho=1 asks for Peto-Peto test
    Call:
    survdiff(formula = Surv(days, pn == 1) ~ stock, data = calf_pneu, rho = 1)
                      N Observed Expected (O-E)^2/E (O-E)^2/V
    stock=batch      12     2.89     5.25      1.06      3.13
    stock=continuous 12     6.41     4.05      1.38      3.13
     Chisq= 3.1  on 1 degrees of freedom, p= 0.0766
Finally we can compare survivor functions with a stock R plot or using ggplot2. With ggplot2, you get the necessary data from the survfit object and create a new data frame from it. The baseline data (time = 0) are not there, so you create them yourself.
Figure: K-M survival curves, by stocking type
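To make the estimators and the test statistic above more concrete, here is a minimal base-R sketch that computes them by hand. It is not the post's code: the `days` and `event` vectors are hypothetical toy data (not the calf pneumonia data), the Nelson-Aalen estimate is computed directly as a cumulative sum rather than via the Fleming-Harrington transformation, and the last part only re-derives the `(O-E)^2/E` column and the p-value from the numbers printed in the rho = 0 output.

```r
# Toy data (hypothetical): follow-up time in days and an event indicator
# (1 = event observed, 0 = censored).
days  <- c(10, 20, 20, 35, 50, 50, 80, 150, 150, 150)
event <- c( 1,  1,  0,  1,  1,  1,  0,   0,   0,   0)

# Distinct event times, number at risk (n_j) and number of events (d_j).
event_times <- sort(unique(days[event == 1]))
at_risk <- sapply(event_times, function(t) sum(days >= t))
deaths  <- sapply(event_times, function(t) sum(days == t & event == 1))

km_surv   <- cumprod(1 - deaths / at_risk)  # Kaplan-Meier: prod(1 - d_j / n_j)
na_cumhaz <- cumsum(deaths / at_risk)       # Nelson-Aalen: sum(d_j / n_j)

round(km_surv, 3)    # 0.900 0.800 0.686 0.457
round(na_cumhaz, 3)  # 0.100 0.211 0.354 0.687

# Log-rank table (rho = 0): observed and expected events as printed above.
observed <- c(batch = 4, continuous = 8)
expected <- c(batch = 6.89, continuous = 5.11)
oe2_over_e <- (observed - expected)^2 / expected
round(oe2_over_e, 2)                        # 1.21 1.63

# p-value of the printed chi-square statistic on 1 degree of freedom.
p_value <- pchisq(2.99, df = 1, lower.tail = FALSE)
round(p_value, 3)                           # 0.084
```

With the survival package loaded and the real data available, `survfit` and `survdiff` do all of this in one call each; the hand computation is only meant to show what the columns mean.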
about 15 hours ago
(This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers)
A couple of days ago, we had a quick chat on Karl Broman's blog, about snakes and ladders (see http://kbroman.wordpress.com/…) with Karl and Corey (see http://bayesianbiologist.com/….), and the use of Markov chains. I do believe that this application is truly awesome: the example is understandable by anyone, and computations (almost any kind, from what we’ve tried) are easy to perform. At the same time, some French students asked me specific details regarding some old lecture notes on Markov chains, and on an introductory example I used as a possible motivation: the stepping stone algorithm. In the notes, I just mentioned the idea of this popular generic algorithm (introduced in Sawyer (1976)) and I used simulations to show - visually - how it works. Again, it was just to motivate the course, which actually did focus on the theory of Markov chains. But those students wanted more, like how did I get the transition matrix, for instance. And that is actually not a simple question, from a computational perspective. I mean, I can easily generate this Markov chain, but writing explicitly the transition matrix, that was another story. Which took me a bit longer. In a very specific case…
But let us get back to the roots, and to the stepping stone algorithm. At least, one of them (the one I used in my notes) because it looks like there are several algorithms. We do consider a grid, say n × n, with some colors inside, say k possible colors. Each cell of the grid has a given color. Then, at some stage, we select randomly one cell in the grid, and it will take the color of one of its neighbors (some kind of absorption, or mutation). This is, more or less, what is also detailed in some lecture notes by James Propp (see also Sato (1983) or Zähle et al. (2005) for more theoretical details about that Markov chain).
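The update rule described above can be sketched in a few lines of R. This is only an illustration, not the code from the lecture notes: the function name `stepping_stone_step` is made up, neighbours are assumed to be the cells directly above, below, left and right (no wrap-around at the edges), the neighbour is assumed to be chosen uniformly, and the grid is assumed to be at least 2 × 2.

```r
# One update of the stepping stone chain on a grid of colours (a sketch).
# Assumptions: 4-neighbourhood, no wrap-around, uniform choice of neighbour,
# and a grid with at least 2 rows and 2 columns.
stepping_stone_step <- function(grid) {
  n_row <- nrow(grid)
  n_col <- ncol(grid)
  i <- sample.int(n_row, 1)                # row of the randomly selected cell
  j <- sample.int(n_col, 1)                # column of the selected cell
  nb <- rbind(c(i - 1, j), c(i + 1, j), c(i, j - 1), c(i, j + 1))
  inside <- nb[, 1] >= 1 & nb[, 1] <= n_row & nb[, 2] >= 1 & nb[, 2] <= n_col
  nb <- nb[inside, , drop = FALSE]         # keep neighbours inside the grid
  pick <- nb[sample.int(nrow(nb), 1), ]    # one neighbour, chosen uniformly
  grid[i, j] <- grid[pick[1], pick[2]]     # selected cell takes that colour
  grid
}

set.seed(1)
h <- 3
colors <- c("red", "blue")
grid <- matrix(sample(colors, h^2, replace = TRUE), h, h)
for (step in 1:20) grid <- stepping_stone_step(grid)
grid
```

Each call changes at most one cell, so simulating the chain is cheap even for large grids; the difficulty discussed in the post is writing down the transition matrix explicitly, not generating the chain.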
This is extremely simple to generate (that’s what I did in my notes, with very big grids, and a lot of colors). But what if we want to write the transition matrix? First of all, we need to define the state space. Basically, for an n × n grid we do have n^2 cells, each of them has one color, chosen among k. Which gives us k^(n^2) possible states…. And that can be large. I mean, even if we consider the smallest possible grid (that might be interesting) and only a few colors, the number of possible states is large, not huge. But we should keep in mind that we have to compute a transition matrix, and the number of elements of that matrix is the square of the number of states. More generally, we talk about writing down matrices with k^(2n^2) elements. If we want black and white 4 × 4 grids, that would mean a matrix with 2^32 elements, which means more than 4 billion elements! And if we consider a red-green-blue 3 × 3 grid, we have to write out a matrix with 3^18 elements, i.e. almost 400 million elements. So, let’s face it: we can only work with 3 × 3 bi-color grids. So let’s try… The good thing is that it can be related to work I’ve been doing recently on binomial recombining trees (binomial being related to bi-color). First of all, our grid will be described as follows
    > h=3
    > M=matrix(1:(h^2),h,h)
    > M
         [,1] [,2] [,3]
    [1,]    1    4    7
    [2,]    2    5    8
    [3,]    3    6    9
with two colors
    > color=c("red","blue")
Then, we should look for neighbors, or derive a neighborhood matrix,
    > d=function(i,j) dist(rbind(c((i-1)%/%h,(i-1)%%h),
    + c((j-1)%/%h,(j-1)%%h)))
    > Neighb=matrix(Vectorize(d)(rep(1:(h^2),each=h^2),
    + rep(1:(h^2),h^2)),h^2,h^2)
    > trunc(Neighb*100)/100
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
     [1,] 0.00 1.00 2.00 1.00 1.41 2.23 2.00 2.23 2.82
     [2,] 1.00 0.00 1.00 1.41 1.00 1.41 2.23 2.00 2.23
     [3,] 2.00 1.00 0.00 2.23 1.41 1.00 2.82 2.23 2.00
     [4,] 1.00 1.41 2.23 0.00 1.00 2.00 1.00 1.41 2.23
     [5,] 1.41 1.00 1.41 1.00 0.00 1.00 1.41 1.00 1.41
     [6,] 2.23 1.41 1.00 2.00 1.00 0.00 2.23 1.41 1.00
     [7,] 2.00 2.23 2.82 1.
about 16 hours ago
A decade of global data attempts to analyse the details of bullying. But what can the figures really tell us about an issue that is so difficult to record?
The evolution of social media and mobile communication may have made it easier than ever for young people to share, but they also create an environment that can make bullying "inescapable and even more threatening than ever before", according to a new report by Child Helpline International (CHI). CHI, a network of government and civil society organisations, operates 173 child helplines in over 142 countries and in the past 10 years has collated a database of more than 126m contacts by children, and by adults on behalf of young people, from its member helplines. The 126m refers to the number of conversations that have taken place between a child or young person and a counsellor of a child helpline somewhere in the world, on any subject a child or young person wanted to talk about. The database has collated data through any form of communication used by child helplines including telephone, chat, SMS, message boards, walk-in centres and outreach activities.
Of the 126m, nearly 4m have been about abuse and violence, including categories such as bullying, emotional abuse, physical abuse, sexual abuse and neglect. And since CHI started collecting data on cyberbullying in 2011, more than 27,000 contacts have been recorded on this subject.
On average, every child helpline in the world receives nine contacts per day from children and young people who are suffering the effects of bullying, according to CHI.
Of course the results of this report don't tell the whole story: for every child that seeks advice by contacting a helpline, there are many more that do not have the access, confidence or privacy to do the same.
As a result, gathering data on the number of children suffering from bullying has never been simple, but CHI's analysis helps give an insight into a global problem affecting many.
NSPCC statistics on bullying, collated from government reports and research, suggest that almost half of children and young people have been bullied at school at some point in their lives. The NSPCC also reports that 38% of young people have been affected by cyberbullying.
Figures from a 2011 report by the Department for Education (DfE) also show that girls are twice as likely as boys to experience persistent cyberbullying. This trend was also apparent in CHI's analysis - the number of girls contacting them about cyberbullying was slightly higher than boys, although 90% of those contacting child helplines "hesitated to disclose their gender to protect their identity and maintain their anonymity after having suffered online abuse".
The overall proportion that were not willing to disclose their gender for recording purposes was 71%. Ofcom research, published in 2008, showed that almost half of children aged 8-17 who use the internet had set up their own profile on a social networking site. The Ofcom research also reported the following observation:
"It also appears likely that when children receive hostile, bullying or hateful messages, they are generally ill-equipped to respond appropriately or to cope with the emotional upset this causes"
So what else does the release by CHI show? Well, the number of contacts received in 2012 was more than double those received in 2006, but as CHI note, the rise in contacts could also be associated with growing awareness of bullying amongst children. Bullying can take many forms, but analysis of information gathered since 2011 has highlighted four major categories: emotional, physical, exposure and theft. Almost half of the contacts on bullying could be categorised as emotional bullying and nearly a quarter as physical abuse. Instances where young people have been either exposed to bullying as a witness or have had belongings stolen accounted for 12.5% of contacts each.
Emotional bullying
about 16 hours ago