Statistics

How bad is the deficit really? We bring you all the data going back to the 1940s

How bad is Britain's deficit? The latest set of figures shows that Britain's deficit was £2.5bn lower in April than in the same month a year earlier.

The Office for National Statistics said public sector net borrowing came in at £85.1bn for the 2012-13 financial year. That's a £35.8bn improvement on the £120.9bn in the previous year.

Heather Stewart writes today:

George Osborne received a boost on Wednesday with news that the deficit was £2.5bn lower in April than the same month a year earlier, boosting hopes that his plan to repair the UK's public finances is back on track.

We have the complete set of data on Government borrowing, all the way back to the 1940s. All political parties have faced their fair share of debt through the years - almost as if the economic climate has its own life independent of who is managing it.

[Chart: UK public debt]

What is the deficit?

When the ONS talks about the deficit, it takes a simple measure: the gap between what's coming into the government in taxes and receipts and what's being spent. Most commentators look at net borrowing as the deficit figure, because it includes investment spending. It's different from the national debt, which is the total the country owes.

So last month the budget was in deficit. Here are the key facts for April, excluding the temporary effects of the financial interventions in the banks:

• Public sector current budget deficit was £5.6bn in April 2013; this is a £2.5bn lower deficit than in April 2012, when there was a deficit of £8.2bn.
• Public sector net borrowing (PSNB ex) was £6.3bn in April 2013; this is £25.4bn higher net borrowing than in April 2012, when net borrowing was -£19.1bn.
• For 2012/13, public sector net borrowing (PSNB ex) was £85.1bn; this is £35.8bn lower net borrowing than in 2011/12, when net borrowing was £120.9bn.
• For 2012/13, central government net cash requirement was £109.7bn; this is a £16.8bn lower net cash requirement than in 2011/12, when the net cash requirement was £126.5bn.
• In 2012/13, public sector net borrowing and public sector current budget deficit are reduced by £6.4bn as a result of cash transfers from the Bank of England Asset Purchase Facility Fund to Government.
• In 2012/13, public sector net borrowing and public sector net investment are reduced by £28.0bn as a result of the transfer of the Royal Mail Pension Plan in April 2012.
• After removing the effects of the transfer of the Royal Mail Pension Plan and the transfers from the Bank of England Asset Purchase Facility, the first 2012/13 estimate of public sector net borrowing is similar in level to last year's borrowing at £119.5bn, £1.4bn lower net borrowing than in 2011/12.
• Public sector net debt was £1,185.3bn at the end of March 2013, equivalent to 75.2% of gross domestic product (GDP).

The ONS data below shows monthly, quarterly and annual debt and deficit - what can you do with it?

Julia Kollewe, Simon Rogers
guardian.co.uk © 2013 Guardian News and Media Limited or its affiliated companies. All rights reserved.
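The adjustments described in the ONS bullet points above can be checked with a few lines of arithmetic. The following is only a sketch in R; the variable names are made up for illustration, and the figures (in £bn) are copied from the bullets. The implied GDP figure is derived here from the debt and debt-to-GDP numbers and is not quoted by the ONS in this article.

```r
# Figures in £bn, copied from the ONS bullet points (variable names are illustrative).
psnb_2012_13 <- 85.1    # public sector net borrowing (PSNB ex), 2012/13
psnb_2011_12 <- 120.9   # public sector net borrowing (PSNB ex), 2011/12
apf_transfer <- 6.4     # Asset Purchase Facility cash transfers
royal_mail   <- 28.0    # Royal Mail Pension Plan transfer

# Headline improvement in net borrowing.
headline_change <- psnb_2011_12 - psnb_2012_13

# Borrowing with both one-off effects added back, and the change on that basis.
underlying        <- psnb_2012_13 + apf_transfer + royal_mail
underlying_change <- psnb_2011_12 - underlying

# GDP implied by net debt of £1,185.3bn being 75.2% of GDP (a derived figure).
implied_gdp <- 1185.3 / 0.752

print(round(c(headline = headline_change,
              underlying = underlying,
              underlying_change = underlying_change,
              implied_gdp = implied_gdp), 1))
```

On these inputs the headline improvement of £35.8bn shrinks to £1.4bn once the two one-off transfers are added back, which matches the ONS statement that underlying borrowing was "similar in level" to the previous year.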
about 1 hour ago
I’m looking for small consulting projects to fill the gaps between larger projects. I’m available for projects that would take up to a few days. I can’t take on another large project right now. However, if your company takes several weeks to initiate a project, we could start the process now and I may be available by the time the paperwork is done. If you have a project you’d like to discuss, please let me know.
about 7 hours ago
The Rendition Project has spent three years creating an interactive guide to CIA rendition flights of terrorist suspects, containing more than 11,000 rows of data
James Ball
about 7 hours ago
The Rendition Project, a collaboration between UK academics and the NGO Reprieve, has produced one of the most detailed and illuminating research projects shedding light on the CIA's extraordinary rendition project to date. Here's how to use it.

• See The Rendition Project interactive here

The Rendition Project, run by UK academics, has collaborated with the NGO Reprieve to produce one of the most detailed and illuminating research projects shedding light on the CIA's extraordinary rendition project to date.

In a single interactive graphic, it shows in great detail the data behind every confirmed and suspected rendition flight, and then, as it's also intended as a tool to fuel further research and digging, a huge number of other flights of the planes linked to rendition. In total, the data powering the graphic runs to more than 11,000 lines.

Of course, that means that the graphic is complex, and so we've provided a guide on how to read and interpret it below. A key caveat is that not every flight contained within the interactive is tied to rendition: some are suspected rendition flights, others are simply flights by planes with tail numbers that were used on suspected rendition flights.

It's also important to note that just because a particular company owned or operated a plane believed to have been involved in rendition, it does not necessarily follow that the company itself was involved in or even aware of those activities. In some cases, it's unclear whether the airline companies would have been aware of the purpose of the flights.

A wealth of supporting data and research, including original documents, has been published directly on The Rendition Project's website.

Now, here's how to get the most from the interactive:

Picking what to look at

By default, the graphic shows a huge tangle of different flight routes. It's displaying information on the 1,500 or so flights marked as significant within the data: the ones with some suspected involvement in rendition (those doing advanced research can toggle this off using the "key circuits only" drop-down menu).

The graphic is easiest to use if this is narrowed down. The graphic is broken down into "circuits" of flights: a full trip made up of several different legs. The screenshot used to illustrate this post represents a round-the-world circuit made up of a number of different airport-to-airport trips.

Circuits can include original journeys from America, R&R stops in the Caribbean, refuelling stops, and the central rendition journeys themselves.

The menu on the left-hand side of the graphic gives a range of ways the information can be narrowed down: trips which only take in certain airports can be picked, or particular companies, or particular individuals known to have been targets of rendition. The date range can also be selected using the sliding toggles.

Hitting the large "SEARCH" button at the bottom-left will then update the map with the new settings.

What the different colours mean

Different individual flights are colour-coded by their significance.

The simplest flights are marked in grey. These are legs of the flights where the researchers had no reason to believe there was any detainee aboard the aircraft. These mark refuelling stops, planes getting into position, R&R stops, or similar.

At the other end of the scale, strong red lines mark a flight designated a "rendition flight". These are flights where the researchers are as near to certain as investigators on these topics can be that a detainee, often a named detainee, was aboard the plane. These are backed by a wealth of evidence.

Paler red lines mark "highly suspicious" or "suspicious" flights: ones where there is evidence, often strong, that a detainee was aboard a given flight, but where the researchers are not quite so confident. Some of the "suspicious" flights have been flagged because of very similar routes or timings to flights tied to rendition, rather than specific evidence on that particular flight.

The
about 7 hours ago
(This article was first published on Milano R net, and kindly contributed to R-bloggers)

Nowadays, routine operations on files, such as renaming or copying, are performed with a few mouse clicks. Sometimes it is useful to perform these operations in batch. Linux users do this through the shell. Windows users can also use the shell, but there are many utilities that simplify these operations. Why should someone use R to copy or rename a lot of files? For an R user, R can be more intuitive than the operating system shell. I found another good reason to use R for these operations: I needed to operate on files as a preliminary step to my statistical analyses.

I received a lot of files (about 20,000). The files were contained in many directories structured as follows. Each directory refers to a day and contains some useless files, which I ignored, and a subdirectory with the txt files I need. The main directory has a name like "2012_09_21_Fri", while the subdirectory has a name like "Fri 21 sep 2012". So I need to copy the relevant files into a directory with a name like "2012-09-21".

The first step is listing all the directories I have. I saved the full path and the bare name of each directory in two different R vectors.

    fl = list.files(dirIn, full.names = TRUE)
    dn = list.files(dirIn, full.names = FALSE)

At this point, in every directory (so I put the code below in a for loop), I look for the subdirectory (it is the first element of the directory) and list all the files it contains. Here cfl is the current element of fl, by analogy with cdn below.

    # First entry of the current day directory is the subdirectory with the txt files
    subdir = list.files(cfl, full.names = TRUE)[[1]]
    flTxt = list.files(subdir, full.names = TRUE)

Now I need to create a new directory with a name like "2012-09-21". As seen above, the day, month and year are available in the directory name, but they are not well structured. So I can use the paste0() and substr() functions to build the name. Please note that cdn contains only one element of dn, for example cdn = dn[index], where index is the loop counter.

    # For a name like "2012_09_21_Fri": year is characters 1-4, month 6-7, day 9-10
    subdirName = paste0(substr(cdn, 1, 4), "-", substr(cdn, 6, 7), "-", substr(cdn, 9, 10))

Now I can create my directory, using the dir.create() function:

    dir.create(subdirName)

Now I need to copy all the txt files from their old subdirectories to the new directories I created above.

    file.copy(from = flTxt, to = subdirName)

Finally, the txt file names are also difficult to interpret, so I need to rename these files. I list the files in the following way, removing the full path:

    # basename() drops the directory part regardless of how deep the path is
    oldNames = basename(list.files(subdirName, full.names = TRUE))

And now I can rename my files. newNames is a character vector containing the new file names; it is built similarly to subdirName.

    file.rename(from = file.path(subdirName, oldNames), to = file.path(subdirName, newNames))

To leave a comment for the author, please follow the link and comment on his blog: Milano R net. R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...
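The steps of the file-copying post above can be put together into one loop. What follows is a minimal sketch, not the author's actual script: the function names dayDirName and copyDayFiles, the dirOut argument and the ".txt" filter are assumptions introduced for illustration, and the directory names are assumed to follow the "2012_09_21_Fri" pattern quoted in the post.

```r
# Hypothetical helper: turns a name like "2012_09_21_Fri" into "2012-09-21".
dayDirName <- function(cdn) {
  paste0(substr(cdn, 1, 4), "-", substr(cdn, 6, 7), "-", substr(cdn, 9, 10))
}

# Hypothetical wrapper: for every day directory under dirIn, copy the txt
# files found in its first subdirectory into dirOut/<yyyy-mm-dd>.
copyDayFiles <- function(dirIn, dirOut) {
  fl <- list.files(dirIn, full.names = TRUE)  # full paths
  dn <- basename(fl)                          # bare directory names
  for (index in seq_along(fl)) {
    # Immediate subdirectories only; plain files in the day directory are ignored.
    subdirs <- list.dirs(fl[index], full.names = TRUE, recursive = FALSE)
    if (length(subdirs) == 0) next
    flTxt <- list.files(subdirs[1], pattern = "\\.txt$", full.names = TRUE)
    target <- file.path(dirOut, dayDirName(dn[index]))
    dir.create(target, recursive = TRUE, showWarnings = FALSE)
    file.copy(from = flTxt, to = target)
  }
  invisible(list.files(dirOut, recursive = TRUE))
}
```

Calling copyDayFiles("raw", "clean") would then leave one "yyyy-mm-dd" directory per day under "clean"; both directory names are placeholders. Using list.dirs() instead of taking the first entry of list.files() is a deliberate change, so that a stray file sorting before the subdirectory does not get picked by mistake.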
about 9 hours ago
(This article was first published on eKonometrics, and kindly contributed to R-bloggers)

The recent elections in Pakistan on May 11 were a great success by all means. In spite of the threats of violence by Al-Qaeda and its local franchises in Pakistan against those who would vote, millions of Pakistanis did step out to vote for an elected government. The Election Commission of Pakistan (ECP) claimed a voter turnout of 60%. One would have hoped to see 50.5 million votes polled for a 60% turnout by the 84.2 million registered voters in the 262 ridings of the National Assembly for which the ECP reported results. However, ECP's own data reported 44.9 million votes, resulting in a gap of approximately 5.7 million votes. The actual turnout was thus close to 53%.

I used R to siphon off data for 262 ridings, which ECP reported on separate web pages. The R code is presented below. Several right-hand sides were lost when the page was extracted; the lines marked "assumption" are reconstructions from the comments, not the author's confirmed code.

    library(XML)

    # Get the URL prefix
    u1 <- "http://www.ecp.gov.pk/electionresult/Search.aspx?constituency=NA&constituencyid=NA-"

    # Loop through the 272 ridings
    for (i in 1:272) {
      # Get the riding number (assumption: right-hand side lost in the source)
      u2 <- i
      # Complete the URL address
      url2 = paste(u1, u2, sep = "")
      # Read the table
      ridedata = readHTMLTable(url2, header = T, which = 8, stringsAsFactors = F)
      # Read the HTML page (assumption: right-hand side lost in the source)
      web_page <- readLines(url2)
      # Pull out the line with the riding name using the identifier "specialheading"
      # (assumption: right-hand side lost in the source)
      ridename <- web_page[grep("specialheading", web_page)]
      # Get the starting position of the riding name
      startx = regexpr("(", ridename, fixed = TRUE)
      startx = startx[1] + 1
      # Get the last position of the riding name (search pattern lost in the source)
      endx = regexpr("…", ridename, fixed = TRUE)
      endx = endx[1] - 2
      # Generate the riding name
      ridename = substr(ridename, startx, endx)
      # Store the data for this riding in its own object
      assign(paste0("fname", u2), cbind(ridedata, riding = i, rname = ridename))
    }

I used a simple rbind command to assemble the data in one large file after first storing the individual riding data in separate files. This was done because the server timed out several times during the execution, and it allowed me to restart from the riding where the system failed, rather than starting from the beginning every time.

To leave a comment for the author, please follow the link and comment on his blog: eKonometrics.
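The restart-friendly pattern described above (store each riding separately, skip what is already stored, then rbind everything) can be sketched as follows. This is an illustration under assumptions rather than the author's code: fetchRiding is a hypothetical stand-in that returns made-up rows instead of scraping the ECP site, and scrapeWithRestart, cacheDir and the file naming scheme are all invented for the example.

```r
# Hypothetical stand-in for the per-riding scrape; returns a small made-up data frame.
fetchRiding <- function(i) {
  data.frame(riding = i, candidate = c("A", "B"), votes = c(100 + i, 50 + i))
}

# Fetch ridings one at a time and cache each result on disk, so that a rerun
# after a timeout skips every riding that was already downloaded.
scrapeWithRestart <- function(ids, cacheDir, fetch = fetchRiding) {
  dir.create(cacheDir, recursive = TRUE, showWarnings = FALSE)
  for (i in ids) {
    f <- file.path(cacheDir, sprintf("riding_%03d.rds", i))
    if (file.exists(f)) next                       # already fetched on an earlier run
    result <- tryCatch(fetch(i), error = function(e) NULL)
    if (!is.null(result)) saveRDS(result, f)       # failures are simply left for the next run
  }
  # Assemble whatever has been cached so far into one table.
  files <- list.files(cacheDir, pattern = "\\.rds$", full.names = TRUE)
  do.call(rbind, lapply(files, readRDS))
}
```

Running scrapeWithRestart(1:272, "ecp_cache") repeatedly would fill in the missing ridings on each pass; "ecp_cache" is a placeholder directory name. Catching the error per riding, instead of letting the loop stop, is a design choice of this sketch and goes slightly beyond what the post describes.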
about 10 hours ago
About 35,000 meteorites have been recorded since 2500 BC, and a little over 1,000 of them were seen while they fell, based on data from the Nomenclature Committee of the Meteoritical Society. Carlo Zapponi, a data visualization designer, visualized the latter in Bolides. We saw a mapped version of this data a while back, but Bolides takes a time-based approach. A bar chart shows the number and volume of meteorites that have been seen over time, and on the initial load, you get to watch the meteorites fall, one bright orange fireball at a time.
about 11 hours ago
(This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers)

# As a homage to Yitang Zhang, who has proven a mind-bending property of prime pairs, I have written a prime sieve to detect all of the prime numbers from 1 to N.
# There might very well be a function in the base package that already does this. No doubt there are a dozen math packages out there which do this. However, it is the first time I have programmed a prime sieve :)
# A prime sieve is a simple algorithm which grabs the first number after 1 and eliminates all numbers divisible by it. Then it grabs the next number in the remaining set and does the same for that.

primes = function(n=1000, printProgress=F) {
  # 1 is always in the list (kept as in the original, although 1 is not a prime by the usual definition)
  prime = 1
  # The available set we look at is everything greater than 1 up to n
  set = 2:n
  # Loop through the set, dropping anything which is not a prime, and the primes as we get to them as well
  while (length(set)>0) {
    # Add the first number we encounter to our prime list
    prime = c(prime, set[1])
    # Drop every multiple of that number, including the number itself
    set = set[floor(set/set[1])!=set/set[1]]
    if (printProgress) print(paste("Elements Remaining: ",length(set)))
  }
  return(prime)
}

# Finding the primes among the first 1,000 integers works pretty fast.
primes1k = primes(printProgress=T)

# See it mapped out
require(ggplot2)
qplot(primes1k) + geom_histogram(aes(fill = ..count..))

# To look at this idea of prime pairs, let's look at the primes up to 100,000.
# Finding the primes of the first 100,000 numbers takes much longer.
primes100k = primes(10^5, printProgress=T)
qplot(primes100k) + geom_histogram(aes(fill = ..count..))

# There is a bit of a stretch near the beginning, while between 40,000 and 70,000 elements are left in the remaining set, in which identifying a prime does not eliminate more than a few elements from the set. After 40,000 elements things start speeding up because the list gets shorter and can be scanned faster.
primes1m = primes(10^6, printProgress=T)
qplot(primes1m) + geom_histogram(aes(fill = ..count..))

length(primes1m)
# This identifies 78,499 numbers between 1 and 1 million (78,498 primes plus the leading 1).

10^6/length(primes1m)
# If primes were distributed evenly they would be on average 12.7 numbers apart.

# Yitang Zhang proved the astonishing fact that, no matter how far out we go, there are always more pairs of primes that differ by less than 70 million. This is true even though the average distance between neighbouring primes keeps growing.
# There are no known uses of this theorem. However, once again a mathematician has proved something fundamental about numbers which might aid humanity in the distant future. Currently Fermat's little theorem is widely used as the basis for modern cryptography. Perhaps Yitang Zhang's result will be the basis for equally important work in the future.

To leave a comment for the author, please follow the link and comment on his blog: Econometrics by Simulation.
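For comparison with the set-shrinking sieve above, here is a sketch of the classic Sieve of Eratosthenes written with a logical vector. It is a different technique from the post's function, not a restatement of it: instead of rebuilding the remaining set on every pass, it marks multiples as composite in place. The function name sieve is made up for this example, and unlike the post's primes() it does not include 1 in its output.

```r
# Sieve of Eratosthenes on a logical vector: isPrime[k] stays TRUE only if k is prime.
sieve <- function(n) {
  if (n < 2) return(integer(0))
  isPrime <- rep(TRUE, n)
  isPrime[1] <- FALSE                 # 1 is not counted as a prime here
  p <- 2
  while (p * p <= n) {
    if (isPrime[p]) {
      # Multiples below p^2 were already crossed out by smaller primes.
      isPrime[seq(p * p, n, by = p)] <- FALSE
    }
    p <- p + 1
  }
  which(isPrime)
}

p1m <- sieve(10^6)
length(p1m)            # number of primes up to one million
10^6 / length(p1m)     # average spacing if the primes were spread evenly
sum(diff(p1m) == 2)    # neighbouring primes that differ by exactly 2 (twin primes)
```

The last line connects back to the prime-pairs theme: diff() gives the gap between each prime and the next, so counting gaps equal to 2 counts twin-prime pairs in the range.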
about 11 hours ago
The OpenData StackExchange site has just launched in beta, and looks to be a great resource for open data sources. Like StackOverflow for programming and CrossValidated for statistics, OpenData is a question and answer site for developers and researchers interested in open data. There's no R tag yet (though that would be nice for data sources specifically compatible with R), but there are already some useful tags for government data, APIs, and tools (to name just a few). If you have expertise on open data to share, or just have a question you need an answer to, follow the link below and check it out. StackExchange: Open Data beta (via JM)
about 19 hours ago
(This article was first published on Revolutions, and kindly contributed to R-bloggers) The OpenData StackExchange site has just launched in beta, and looks to be a great resource for open data sources. Like StackOverflow for programming and CrossValidated for statistics, OpenData is a question and answer site for developers and researchers interested in open data. There's no R tag yet (though that would be nice for data sources specifically compatible with R), but there are already some useful tags for government data, APIs, and tools (to name just a few). If you have expertise on open data to share, or just have a question you need an answer to, follow the link below and check it out. StackExchange: Open Data beta (via JM) To leave a comment for the author, please follow the link and comment on his blog: Revolutions.
about 19 hours ago