Author Topic: The Age of Big Data Theft by Bilderberg Nazis for Anti-Human Enslavement Grids


Offline Dig

Big data
http://en.wikipedia.org/wiki/Big_data

In information technology, big data[1] consists of datasets that grow so large that they become awkward to work with using on-hand database management tools. Difficulties include capture, storage,[2] search, sharing, analytics,[3] and visualization. The trend continues because working with larger and larger datasets allows analysts to "spot business trends, prevent diseases, combat crime."[4] Though a moving target, current limits are on the order of terabytes, exabytes and zettabytes of data.[5] Scientists regularly encounter this problem in meteorology, genomics,[6] connectomics, complex physics simulations,[7] biological and environmental research,[8] Internet search, finance and business informatics. Data sets also grow in size because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification readers, and wireless sensor networks.[9][10] The world’s technological per capita capacity to store information has roughly doubled every 40 months since the 1980s (about every three years),[11] and every day 2.5 quintillion bytes of data are created.[12]

One current feature of big data is the difficulty of working with it using relational databases and desktop statistics/visualization packages; it requires instead "massively parallel software running on tens, hundreds, or even thousands of servers".[13] The size of "big data" varies depending on the capabilities of the organization managing the set. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."[14]
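To make the "massively parallel" point concrete, here is a minimal Python sketch of the basic pattern such software uses: split a dataset into chunks, process each chunk in a separate worker, then merge the partial results. It is only a single-machine toy under assumed inputs (the file name events.log and the comma-separated record format are invented); real big-data systems apply the same split/merge idea across many servers.

Code:

from collections import Counter
from multiprocessing import Pool

def count_chunk(lines):
    # Tally the first comma-separated field (e.g. an event type) in this chunk.
    counts = Counter()
    for line in lines:
        counts[line.split(",", 1)[0]] += 1
    return counts

def chunks(path, size=100_000):
    # Stream the file in fixed-size batches so it never has to fit in memory.
    batch = []
    with open(path) as f:
        for line in f:
            batch.append(line)
            if len(batch) == size:
                yield batch
                batch = []
    if batch:
        yield batch

if __name__ == "__main__":
    with Pool() as pool:
        total = sum(pool.imap(count_chunk, chunks("events.log")), Counter())
    print(total.most_common(10))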


Definition

Big data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target currently ranging from a few dozen terabytes to many petabytes of data in a single data set.

In a 2001 research report[15] and related conference presentations, then META Group (now Gartner) analyst, Doug Laney, defined data growth challenges (and opportunities) as being three-dimensional, i.e. increasing volume (amount of data), velocity (speed of data in/out), and variety (range of data types, sources). Gartner continues to use this model for describing big data.[16]

Examples

Examples include web logs; RFID; sensor networks; social networks; social data (due to the social data revolution), Internet text and documents; Internet search indexing; call detail records; astronomy, atmospheric science, genomics, biogeochemical, biological, and other complex and/or interdisciplinary scientific research; military surveillance; medical records; photography archives; video archives; and large-scale e-commerce.

Technologies

Big data requires exceptional technologies to efficiently process large quantities of data within tolerable elapsed times. Technologies being applied to big data include massively parallel processing (MPP) databases, datamining grids, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.[citation needed]

Some but not all MPP relational databases have the ability to store and manage petabytes of data. Implicit is the ability to load, monitor, backup, and optimize the use of the large data tables in the RDBMS.[17][18]

The practitioners of big data analytics processes are generally hostile to shared storage[citation needed]. They prefer direct-attached storage (DAS) in its various forms from solid state disk (SSD) to high capacity SATA disk buried inside parallel processing nodes. The perception of shared storage architectures—SAN and NAS—is that they are relatively slow, complex, and above all, expensive. These qualities are not consistent with big data analytics systems that thrive on system performance, commodity infrastructure, and low cost.

Real or near-real time information delivery is one of the defining characteristics of big data analytics. Latency is therefore avoided whenever and wherever possible. Data in memory is good. Data on spinning disk at the other end of a FC SAN connection is not. But perhaps worse than anything else, the cost of a SAN at the scale needed for analytics applications is thought to be prohibitive.

There is a case to be made for shared storage in big data analytics. But storage vendors and the storage community in general have yet to make that case to big data analytics practitioners.[19]

Impact

When the Sloan Digital Sky Survey (SDSS) began collecting data in 2000, it amassed more data in its first few weeks than had been collected in the entire history of astronomy. Continuing at a rate of about 200 GB per night, SDSS has amassed more than 140 terabytes of information. When the Large Synoptic Survey Telescope, successor to SDSS, comes online in 2016, it is anticipated to acquire that amount of data every five days.[20] In total, the four main detectors at the Large Hadron Collider (LHC) produced 13 petabytes of data in 2010 (13,000 terabytes).[21]

More Big Data impacts:

Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data - the equivalent of 167 times the information contained in all the books in the US Library of Congress.

Facebook handles 40 billion photos from its user base.

Decoding the human genome originally took 10 years to process; now it can be achieved in one week.[20]

The impact of “big data” has increased the demand for information management specialists: Oracle, IBM, Microsoft, and SAP have spent more than $15 billion on software firms specializing in data management and analytics. This industry on its own is worth more than $100 billion and is growing at almost 10% a year, roughly twice as fast as the software business as a whole.[20]

Big data has emerged because we are living in a society that makes increasing use of data-intensive technologies. There are 4.6 billion mobile-phone subscriptions worldwide, and between 1 billion and 2 billion people access the internet. Basically, there are more people interacting with data or information than ever before.[20] Between 1990 and 2005, more than 1 billion people worldwide entered the middle class, which means more people with rising incomes and literacy, which in turn drives information growth. The world's effective capacity to exchange information through telecommunication networks was 281 petabytes in 1986, 471 petabytes in 1993, 2.2 exabytes in 2000, and 65 exabytes in 2007,[11] and it is predicted that the amount of traffic flowing over the internet will reach 667 exabytes annually by 2013.[20]

Critique

Danah Boyd has raised concerns that the use of big data in science can neglect principles such as choosing a representative sample, because researchers become too concerned with simply handling the huge amounts of data.[22] This approach may lead to results that are biased in one way or another. Integration across heterogeneous data resources - some that might be considered “big data” and others not - presents formidable logistical as well as analytical challenges, but many researchers argue that such integrations are likely to represent the most promising new frontiers in science.[23] Broader critiques have also been levelled at Chris Anderson's assertion that big data will spell the end of theory, focusing in particular on the notion that big data will always need to be contextualized in their social, economic and political contexts.[24] Even as companies invest eight- and nine-figure sums to derive insight from information streaming in from suppliers and customers, less than 40% of employees have sufficiently mature processes and skills to do so. To overcome this insight deficit, “big data,” no matter how comprehensive or well analyzed, needs to be complemented by “big judgment.”[25]

See also

Cloud computing
Data assimilation
Database theory
Database-centric architecture
Data Intensive Computing
Data structure
Object database
Online database
Real-time database
Relational database
Social data revolution
Supercomputer
Tuple space

Architecture comparison

Survey Distributed Databases
Marin Dimitrov's Comparison on PNUTS, Dynamo, Voldemort, BigTable, HBase, Cassandra and CouchDB May 2010
Why Use HBase-1: from Million Mark to Billion Mark
Why Use HBase-2: Demystifying HBase Data integrity, Availability and Performance
Beyond Hadoop: Next-Generation Big Data Architectures, by Bill McColl, 23 October 2010, about "Not Only Hadoop".
MPI and BSP: see the wiki articles on Bulk Synchronous Parallel and Apache HAMA on a Hadoop cluster (a minimal BSP-style sketch follows this list).
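The last item above mentions the Bulk Synchronous Parallel model. As a rough illustration only (a single-process Python toy, not MPI or Apache HAMA), a BSP computation proceeds in supersteps: local computation, then message exchange, then a barrier before the next round.

Code:

def bsp_run(values, supersteps=3):
    # Each "worker" holds one value. Per superstep it (1) computes a message
    # locally, (2) sends it to its right-hand neighbour, and (3) waits at the
    # barrier before folding in what it received.
    workers = dict(enumerate(values))
    n = len(workers)
    for _ in range(supersteps):
        outbox = {i: v for i, v in workers.items()}               # local computation
        inbox = {(i + 1) % n: msg for i, msg in outbox.items()}   # communication
        workers = {i: workers[i] + inbox[i] for i in workers}     # barrier, then update
    return workers

print(bsp_run([1.0, 2.0, 3.0, 4.0]))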

Performance evaluation

Existing work done by community
2010: Yahoo Cloud Serving Benchmark (YCSB) (a toy latency-measurement sketch in the same spirit follows this list)
2010: HBase, a non-SQL database - performance evaluation
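As a rough idea of what a serving benchmark such as YCSB measures, the toy sketch below issues a mix of reads and writes against a store and reports latency percentiles. A Python dict stands in for the key-value store, and the 95/5 read/write mix is just one common workload shape; the numbers it prints say nothing about any real system.

Code:

import random, statistics, time

store = {}
read_lat, write_lat = [], []

for _ in range(100_000):
    key = f"user{random.randrange(10_000)}"
    start = time.perf_counter()
    if random.random() < 0.95:            # 95% reads, 5% writes
        store.get(key)
        read_lat.append(time.perf_counter() - start)
    else:
        store[key] = "x" * 100
        write_lat.append(time.perf_counter() - start)

for name, lat in (("read", read_lat), ("write", write_lat)):
    if len(lat) > 1:
        p50 = statistics.median(lat) * 1e6
        p99 = statistics.quantiles(lat, n=100)[98] * 1e6
        print(f"{name}: p50 {p50:.2f} us, p99 {p99:.2f} us")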

References
^ White, Tom. Hadoop: The Definitive Guide. 2009. 1st Edition. O'Reilly Media. Pg 3.
^ Kusnetzky, Dan. What is "Big Data?". ZDNet. http://blogs.zdnet.com/virtualization/?p=1708
^ Vance, Ashley. Start-Up Goes After Big Data With Hadoop Helper. New York Times Blog. 22 April 2010. http://bits.blogs.nytimes.com/2010/04/22/start-up-goes-after-big-data-with-hadoop-helper/?dbk
^ Cukier, K. (25 February 2010). Data, data everywhere. The Economist. http://www.economist.com/specialreports/displaystory.cfm?story_id=15557443
^ Horowitz, Mark. Visualizing Big Data: Bar Charts for Words. Wired Magazine. Vol 16 (7). 23 June 2008. http://www.wired.com/science/discoveries/magazine/16-07/pb_visualizing
^ Community cleverness required. Nature, 455(7209), 1. 2008. http://www.nature.com/nature/journal/v455/n7209/full/455001a.html
^ Sandia sees data management challenges spiral. HPC Projects. 4 August 2009. http://www.hpcprojects.com/news/news_story.php?news_id=922
^ Reichman, O.J., Jones, M.B., and Schildhauer, M.P. 2011. Challenges and Opportunities of Open Data in Ecology. Science 331(6018): 703-705. DOI:10.1126/science.1197962
^ Hellerstein, Joe. Parallel Programming in the Age of Big Data. Gigaom Blog. 9 November 2008. http://gigaom.com/2008/11/09/mapreduce-leads-the-way-for-parallel-programming/
^ Segaran, Toby and Hammerbacher, Jeff. Beautiful Data. 1st Edition. O'Reilly Media. Pg 257.
^ a b "The World’s Technological Capacity to Store, Communicate, and Compute Information", Martin Hilbert and Priscila López (2011), Science (journal), 332(6025), 60-65; free access to the article through here: martinhilbert.net/WorldInfoCapacity.html
^ http://www-01.ibm.com/software/data/bigdata/
^ Jacobs, A. (6 July 2009). The Pathologies of Big Data. ACMQueue. http://queue.acm.org/detail.cfm?id=1563874
^ Magoulas, Roger., Lorica, Ben. (Feb 2009) Introduction to Big Data. Release 2.0. Issue 11. Sebastopol, CA: O’Reilly Media. http://radar.oreilly.com/r2/release2-0-11.html
^ Laney, Douglas. "3D Data Management: Controlling Data Volume, Velocity and Variety". Gartner. http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf. Retrieved 6 February 2012.
^ Beyer, Mark. "Gartner Says Solving 'Big Data' Challenge Involves More Than Just Managing Volumes of Data". Gartner. http://www.gartner.com/it/page.jsp?id=1731916. Retrieved 13 July 2011.
^ Monash, Curt eBay’s two enormous data warehouses, 30 April 2009 http://www.dbms2.com/2009/04/30/ebays-two-enormous-data-warehouses/
^ Monash, Curt eBay followup — Greenplum out, Teradata > 10 petabytes, Hadoop has some value, and more, 6 October 2010 http://www.dbms2.com/2010/10/06/ebay-followup-greenplum-out-teradata-10-petabytes-hadoop-has-some-value-and-more/
^ How New Analytic Systems will Impact Storage Sept, 2011 http://www.evaluatorgroup.com/document/big-data-how-new-analytic-systems-will-impact-storage-2/
^ a b c d e http://www.economist.com/node/15557443
^ Geoff Brumfiel (19 January 2011). "High-energy physics: Down the petabyte highway". Nature 469: pp. 282–283. doi:10.1038/469282a. http://www.nature.com/news/2011/110119/full/469282a.html. Retrieved 2 October 2011.
^ Danah Boyd (2010-04-29). "Privacy and Publicity in the Context of Big Data". WWW 2010 conference. http://www.danah.org/papers/talks/2010/WWW2010.html. Retrieved 2011-04-18.
^ Jones MB, Schildhauer MP, Reichman OJ, and Bowers S. 2006. The New Bioinformatics: Integrating Ecological Data from the Gene to the Biosphere. Annual Review of Ecology, Evolution, and Systematics 37(1):519-544
^ Graham M. 2012. Big data and the end of theory?. The Guardian http://www.guardian.co.uk/news/datablog/2012/mar/09/big-data-theory
^ Shah, Horne and Capella. 2012. Good Data Won't Guarantee Good Decisions. Harvard Business Review http://hbr.org/2012/04/good-data-wont-guarantee-good-decisions/ar/1

Offline Dig

U.S. government commits big R&D money to 'Big Data'
http://www.zdnet.com/blog/btl/us-government-commits-big-r-d-money-to-big-data/72760
By Jason Hiner | March 29, 2012, 12:50pm PDT

Summary: The U.S. government is investing $200 million in big data projects to help the U.S. jump ahead in the next frontier of computing.

Calling it one of the most important public investments in technology since the rise of supercomputing and the Internet, six U.S. federal agencies teamed up on Thursday to announce a new $200 million investment “to greatly improve the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data” said the White House Office of Science and Technology Policy (OSTP).

This is a pure research and development initiative that will manifest itself as public/private partnerships and new projects that will drive big data investments in government, education, and business. The catalyst for this move came from the President’s Council of Advisors on Science and Technology, which recommended investing more in big data in order to deal with some of the biggest challenges in the U.S., including issues in health care, energy, and defense.

The agencies involved in the announcement were:

National Institutes of Health
Department of Energy
National Science Foundation
Department of Defense
DARPA
U.S. Geological Survey

The most obvious agencies missing were NASA and the National Oceanic and Atmospheric Administration, which both gather massive amounts of data that could benefit from big data tools and serve the new cause. White House officials said that both agencies will eventually be involved.

The OSTP’s stated goals for the $200 million will be:

Advance state-of-the-art core technologies needed to collect, store, preserve, manage, analyze, and share huge quantities of data.
Harness these technologies to accelerate the pace of discovery in science and engineering, strengthen our national security, and transform teaching and learning; and
Expand the workforce needed to develop and use Big Data technologies.

Confusingly, the Department of Defense also stated that it is “placing a big bet on big data.” It stated that it will invest $250 million annually in big data projects across various departments of the military. The DoD said that it wants to “harness and utilize massive data in new ways and bring together sensing, perception and decision support to make truly autonomous systems that can maneuver and make decisions on their own.” That’s the kind of thing that will scare some people because it sounds like robots and drones that are going to become smart enough to make their own decisions. It’s the stuff science fiction writers have been anticipating, and in some cases fearing, for over half a century. It’s unclear whether the DoD’s $250 million annual investment is separate from the overall $200 million R&D that the OSTP announced.

John Holdren, Director of the White House Office of Science and Technology Policy, said, “In the same way that past federal investments in information technology R&D led to dramatic advances in supercomputing and the creation of the Internet, the initiative we are launching today promises to transform our ability to use Big Data for scientific discovery, environmental and biomedical research, education, and national security.”

In the joint press conference, the agency chiefs not only threw out flowery hyperbole about the potential impact of big data in helping solve some of the most important problems the U.S. is facing in the years ahead; there were also bureaucrats and technologists who clearly have a deep understanding of the engineering and computer science behind big data and were very enthusiastic about this investment. They were confident that it’s going to enable the U.S. to take a big step forward in one of the next frontiers of computing.

Offline Dig

DAM Lowdown: Age of Big Data, Metadata for Government, DAM Product Updates
http://www.cmswire.com/cms/digital-asset-management/dam-lowdown-age-of-big-data-metadata-for-government-dam-product-updates-014514.php
By Rikki Endsley (@rikkiends)   Feb 14, 2012   

Last weekend, The New York Times ran an interesting article about big data’s impact on the world. If you didn’t already think big data was cool, you will after reading this article. We also look at recent product updates and the important role metadata plays in the government.
Age of Big Data

On February 11, The New York Times ran a feature article called The Age of Big Data. “Welcome to the Age of Big Data,” the article says. “The new megarich of Silicon Valley, first at Google and now Facebook, are masters at harnessing the data of the Web — online searches, posts and messages — with Internet advertising.”

Not only does the article explain the concept of big data, it also examines the tools for harnessing data and the predictive potential it holds. For example, the article explains, “Researchers have found a spike in Google search requests for terms like 'flu symptoms' and 'flu treatments' a couple of weeks before there is an increase in flu patients coming to hospital emergency rooms in a region (and emergency room reports usually lag behind visits by two weeks or so).”
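The quoted claim is essentially about a lagged correlation: search volume for flu terms leads emergency-room visits by roughly two weeks. A minimal Python sketch of that idea, using invented weekly series rather than real Google or hospital data:

Code:

import statistics

searches  = [10, 12, 15, 30, 80, 140, 120, 70, 40, 20, 15, 12]  # weekly flu-term query counts (made up)
er_visits = [5, 5, 6, 6, 8, 14, 35, 70, 65, 40, 22, 12]         # weekly ER flu visits (made up)

def corr(xs, ys):
    # Plain Pearson correlation.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

print("same week:  ", round(corr(searches, er_visits), 2))
print("2-week lead:", round(corr(searches[:-2], er_visits[2:]), 2))  # searches shifted two weeks earlier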
CoreSource for SAGE

SAGE, an independent academic publisher of journals, books and electronic media, chose Ingram Content Group's CoreSource for the distribution and management of its e-book and print-book files. CoreSource is an online solution for storing, managing and distributing digital content.

Metadata in the Government

In his commentary, Good metadata means good government, Michael Daconta, former metadata program manager for the Homeland Security Department, explains that the key to metadata design is developing the best descriptive fields that increase the usage value of the data to the end user. “These types of fields can be discovered by asking questions such as ‘How do our users get value out of our data?’ and then walking backward from those answers to the fields that best distinguish the good from the bad (or the proverbial wheat from the chaff),” he says. Daconta is also the author of the book Information As Product.

Dassault Systèmes Acquires Netvibes

French software company Dassault Systèmes acquired Netvibes, a social web analytics platform provider. “If you like Netvibes, you will love the new Netvibes,” says Netvibes CEO Freddy Mini. “Our brand, business, website and team will stay. What will change is that all our products will innovate even faster thanks to our deep relationship with Dassault Systèmes.”

Picturepark 8.2 Announced

Almost a month after announcing version 8.1.9, Picturepark announced the 8.2 release of the company’s DAM solution. According to the announcement, the latest version comes with enhanced support for creative processes and integration into Adobe Creative Suite.

Offline Dig

Recruiting in the Age of Big Data: A Guide for Recruiters
http://www.recruiter.com/i/recruiting-in-the-age-of-big-data-a-guide-for-recruiters/
Jon Parks March 7, 2012

Big Data is hiring, and three of the fastest growing areas of expertise and job growth are in marketing (data-analytics), finance (quant or quantitative finance) and healthcare (bio-informatics).

Few understand what “Big Data” means – much less what it hopes to accomplish.

Hiring for Big Data

If you want to know what these terms mean to employers, and how they’re accomplished, a few similarities and differences help point the way.

Big Data is about the massive quantity of data (information), and its potential for:

targeting and interpreting consumer behavior for advertising and marketing, also called “data-analytics;”
modeling financial events and expectations, also called “quant-finance or quantitative finance;” and
making sense of drugs and therapies that work for some but prove toxic or dangerous for others in a healthcare context, also called “bio-informatics.”

Big Data for Recruiters

To illustrate the approach to problem solving and how it works, consider the pile of resumes on your desk, or the mass of applicants in your database.

Since hiring someone might be the near term objective, we’ll consider that result to be a perfect score. The whole process is complicated, and job applicants can improve their chances with cover letters, resumes, follow-up, and referrals. Lots of things take place during this process.

Big Data helps determine how instrumental (causal) these things are to the objective or goal. Some things may actually be unhelpful and lower the chance of being hired, and some things may not improve the chances at all, or improve them very little.

In the world of Big Data, all of these events are scored against their ability to improve the chance that an applicant will be hired. The math can be very complicated, but the results are never any better than the quality of the information that goes into the process.
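A minimal sketch of that scoring idea, with a handful of invented application records: for each event (cover letter, referral, follow-up), compare the hire rate among applicants who did it with the rate among those who did not. Real systems use far more data and more careful models, and a raw difference in rates is a correlation, not proof of cause.

Code:

applicants = [
    {"cover_letter": 1, "referral": 1, "follow_up": 1, "hired": 1},
    {"cover_letter": 1, "referral": 0, "follow_up": 1, "hired": 1},
    {"cover_letter": 0, "referral": 1, "follow_up": 0, "hired": 1},
    {"cover_letter": 1, "referral": 0, "follow_up": 0, "hired": 0},
    {"cover_letter": 0, "referral": 0, "follow_up": 1, "hired": 0},
    {"cover_letter": 0, "referral": 0, "follow_up": 0, "hired": 0},
]

def hire_rate(rows):
    return sum(r["hired"] for r in rows) / len(rows) if rows else 0.0

for event in ("cover_letter", "referral", "follow_up"):
    with_event = [r for r in applicants if r[event]]
    without_event = [r for r in applicants if not r[event]]
    lift = hire_rate(with_event) - hire_rate(without_event)
    print(f"{event}: {hire_rate(with_event):.2f} vs {hire_rate(without_event):.2f} (lift {lift:+.2f})")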

Mathematicians Don’t Make Good Recruiters

Computers, like the math jocks that use them, speak their own language, and that language (math) does a poor job describing or expressing an applicant’s likeability, charisma, temperament, or any of the dozens of things recruiters look for before hiring someone.

One thing is certain: Big Data cannot work, or even begin, until the problem being described can be translated and represented mathematically. Processes must be identified, outcomes scored, events observed, and their impact or usefulness inferred.

Offline Dig

Kurzweil Accelerating Intelligence newsletter
The Age of Big Data

http://www.kurzweilai.net/the-age-of-big-data

At the World Economic Forum last month in Davos, Switzerland, Big Data was a marquee topic.
A report by the forum, “Big Data, Big Impact,” declared data a new class of economic asset, like currency or gold.

Offline Dig

Management Education in the Age of Big Data
http://smartdatacollective.com/gilpress/46921/management-education-age-big-data
Posted February 22, 2012 with 278 reads

What if business schools based their entire curriculum on the fundamentals of business analytics?

McKinsey estimates that the demand for “deep analytical positions” in the U.S. will exceed supply by 140,000 to 190,000 positions and that there will be a need for 1.5 million additional ”managers and analysts who can ask the right questions and consume the results of the analysis of big data effectively.”

Why assume that “the analysis of big data” will be some special skill performed by a select group of “managers and analysts”? I think that it’s plausible, even desirable, that in the future all managers will be able to speak the language of business analytics and that it will become an integral part of their job.

This goes to the heart of what a manager’s job is. Managerial hierarchies have always been hierarchies of information flows and a significant part of a manager’s job since the advent of the modern corporation has been to move information up and down the organization. It is a century-plus model that is still dominant in established enterprises and edgy start-ups alike. Dustin Moskovitz recently complained about his experience at Facebook: “At the end of a four-week cycle [of information going up and down the management chain], I would know what was going on the previous month.”

Managerial hierarchies are running out of bandwidth today. They can’t cope with the amount of information moving from one layer to the next. The solution may lie not in getting rid of the layers but in investing the power of analytics in each layer, so what moves from one layer to another is analysis, not information.

Existing management education is based on the old model of management hierarchies. In the age of big data, management education should evolve to reflect the new reality that all information is available to everybody in the organization. What all managers need to do is analysis. But U.S. business schools each year graduate more than 500,000 holders of Bachelor's and Master's degrees with little or no knowledge and understanding of analytics and analytical/scientific thinking.

Management educators should assist aspiring managers by teaching tools and techniques for developing models and testing them, not much different from the tools taught to aspiring scientists. They should show future managers how to be critical and skeptical. Currently, most business students hear about testing a hypothesis only in the required statistics course. They pass the exam and move on.

I’m not calling for business schools to produce even more “quants” or “financial engineers” than they already do. My analogy is science—the study of empirical reality—not the production of “elegant” mathematical models that have nothing to do with reality, whose assumptions are never questioned, and which “are too complex for you to understand.” Managers, in any industry, should never be in a position where they are told that, just as they don’t know how their television works, they don’t need to understand how the analytical models driving their business work.

Business schools should revamp their curriculum with courses and hands-on practice in scientific methods, analytical tools, and data mining/programming. Yes, a little bit of coding will go a long way towards preparing students for management in the 21st-century, as will a general knowledge of statistical modeling tools, statistical programming languages, and relational and non-relational databases. I think it will be best to infuse these new knowledge strains and practical experiences in all the traditional courses—marketing, finance, accounting, operations—rather than have a separate “analytics track.” This will give business schools a new focus on a renewed understanding of what management is all about: using analytical methods to make better decisions.
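As one illustration of the kind of hands-on exercise being argued for, the sketch below tests whether campaign B's conversion rate really beats campaign A's using a simple permutation test, instead of taking the raw difference at face value. The conversion counts are invented.

Code:

import random

a = [1] * 30 + [0] * 270   # campaign A: 30 conversions out of 300 (made up)
b = [1] * 45 + [0] * 255   # campaign B: 45 conversions out of 300 (made up)

observed = sum(b) / len(b) - sum(a) / len(a)

pooled = a + b
hits, trials = 0, 10_000
for _ in range(trials):
    random.shuffle(pooled)
    # Re-split the shuffled pool and see how often chance alone produces a lift this large.
    diff = sum(pooled[len(a):]) / len(b) - sum(pooled[:len(a)]) / len(a)
    if diff >= observed:
        hits += 1

print(f"observed lift: {observed:.3f}, p-value ~ {hits / trials:.3f}")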

Offline Dig

Privacy in the Age of Big Data
http://thehealthcareblog.com/blog/2012/02/13/privacy-in-the-age-of-big-data/
By Omer Tene & Jules Polonetsky

We live in an age of “big data.” Data has become the raw material of production, a new source of immense economic and social value. Advances in data mining and analytics and the massive increase in computing power and data storage capacity have expanded, by orders of magnitude, the scope of information available to businesses, government, and individuals.[1] In addition, the increasing number of people, devices, and sensors that are now connected by digital networks has revolutionized the ability to generate, communicate, share, and access data.[2] Data create enormous value for the global economy, driving innovation, productivity, efficiency, and growth. At the same time, the “data deluge” presents privacy concerns that could stir a regulatory backlash, dampening the data economy and stifling innovation.[3] In order to craft a balance between beneficial uses of data and the protection of individual privacy, policymakers must address some of the most fundamental concepts of privacy law, including the definition of “personally identifiable information,” the role of consent, and the principles of purpose limitation and data minimization.

Big Data: Big Benefits

The uses of big data can be transformative, and the possible uses of the data can be difficult to anticipate at the time of initial collection. For example, the discovery of Vioxx’s adverse effects, which led to its withdrawal from the market, was made possible by the analysis of clinical and cost data collected by Kaiser Permanente, a California-based managed-care consortium. Had Kaiser Permanente not connected these clinical and cost data, researchers might not have been able to attribute 27,000 cardiac arrest deaths occurring between 1999 and 2003 to use of Vioxx.[4] Another oft-cited example is Google Flu Trends, a service that predicts and locates outbreaks of the flu by making use of information—aggregate search queries—not originally collected with this innovative application in mind.[5] Of course, early detection of disease, when followed by rapid response, can reduce the impact of both seasonal and pandemic influenza.


While a significant driver for research and innovation, the health sector is by no means the only arena for transformative data use. Another example is the “smart grid,” which refers to the modernization of the current electrical grid to achieve a bidirectional flow of information and electricity. The smart grid is designed to allow electricity service providers, users, and other third parties to monitor and control electricity use. Some of the benefits accrue directly to consumers, who are able to reduce energy consumption by learning which devices and appliances consume the most energy, or which times of the day put the highest or lowest overall demand on the grid. Other benefits, such as accurately predicting energy demands to optimize renewable sources, are reaped by society at large.

Traffic management and control is another field witnessing significant data-driven environmental innovation. Governments around the world are establishing electronic toll pricing systems, which set forth differentiated payments based on mobility and congestion charges. Users pay depending on their use of vehicles and roads. These and other uses of data for traffic control enable governments to “potentially cut congestion and the emission of pollutants.”[6]

Big data is also transforming the retail market. Indeed, Wal-Mart’s inventory-management system, called Retail Link, pioneered the age of big data by enabling suppliers to see the exact number of their products on every shelf of every store at each precise moment in time. Many of us use Amazon’s “Customers Who Bought This Also Bought” feature, prompting users to consider buying additional items selected by a collaborative filtering tool. Analytics can likewise be used in the offline environment to study customers’ in-store behavior in order to improve store layout, product mix, and shelf positioning.
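A toy sketch of the co-occurrence idea behind a "Customers Who Bought This Also Bought" feature: for a given item, rank the other items that most often appear in the same baskets. Production collaborative filtering is far more sophisticated; the baskets here are invented.

Code:

from collections import Counter
from itertools import combinations

baskets = [
    {"diapers", "wipes", "beer"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"diapers", "beer"},
]

co = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co[(a, b)] += 1
        co[(b, a)] += 1

def also_bought(item, k=3):
    # Rank co-purchased items by how often they share a basket with `item`.
    return Counter({other: n for (i, other), n in co.items() if i == item}).most_common(k)

print(also_bought("diapers"))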

Big Data: Big Concerns

The harvesting of large data sets and the use of analytics clearly implicate privacy concerns. The tasks of ensuring data security and protecting privacy become harder as information is multiplied and shared ever more widely around the world. Information regarding individuals’ health, location, electricity use, and online activity is exposed to scrutiny, raising concerns about profiling, discrimination, exclusion, and loss of control. Traditionally, organizations used various methods of de-identification (anonymization, pseudonymization, encryption, key-coding, data sharding) to distance data from real identities and allow analysis to proceed while at the same time containing privacy concerns. Over the past few years, however, computer scientists have repeatedly shown that even anonymized data can often be re-identified and attributed to specific individuals.[7] In an influential law review article, Paul Ohm observed that “[r]eidentification science disrupts the privacy policy landscape by undermining the faith that we have placed in anonymization.”[8] The implications for government and businesses can be stark, given that de-identification has become a key component of numerous business models, most notably in the contexts of health data (regarding clinical trials, for example), online behavioral advertising, and cloud computing.
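For a concrete (and deliberately simplified) picture of one de-identification method listed above, the sketch below key-codes a direct identifier with a keyed hash so records can still be linked without exposing the name. The key and the record are invented, and the passage's warning still applies: quasi-identifiers left behind (ZIP code, birth date, sex) can be enough to re-identify someone.

Code:

import hashlib, hmac

SECRET_KEY = b"hold-this-key-somewhere-else"   # hypothetical key, stored apart from the data

def pseudonymize(identifier: str) -> str:
    # Keyed hash: the same input always yields the same pseudonym, so records stay linkable.
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

record = {"name": "Jane Doe", "zip": "02139", "birth_date": "1970-07-04", "diagnosis": "flu"}
released = {**record, "name": pseudonymize(record["name"])}
print(released)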

What Data Is “Personal”?

We urge caution, however, when drawing conclusions from the re-identification debate. One possible conclusion, apparently supported by Ohm himself, is that all data should be treated as personally identifiable and subjected to the regulatory framework.[9] Yet such a result would create perverse incentives for organizations to abandon de-identification and therefore increase, rather than alleviate, privacy and data security risks.[10] A further pitfall is that with a vastly expanded definition of personally identifiable information, the privacy and data protection framework would become all but unworkable. The current framework, which is difficult enough to comply with and enforce in its existing scope, may well become unmanageable if it extends to any piece of information. Moreover, as Betsy Masiello and Alma Whitten have noted, while “[a]nonymized information will always carry some risk of re-identification . . . [m]any of the most pressing privacy risks . . . exist only if there is certainty in re-identification, that is if the information can be authenticated. As uncertainty is introduced into the re-identification equation, we cannot know that the information truly corresponds to a particular individual; it becomes more anonymous as larger amounts of uncertainty are introduced.”[11]

Most importantly, if information that is not ostensibly about individuals comes under full remit of privacy laws based on a possibility of it being linked to an individual at some point in time through some conceivable method, no matter how unlikely to be used, many beneficial uses of data would be severely curtailed. Such an approach presumes that a value judgment has been made in favor of individual control over highly beneficial uses of data, but it is doubtful that such a value choice has consciously been made. Thus, the seemingly technical discussion concerning the scope of information viewed as personally identifiable masks a fundamental normative question. Policymakers should engage with this normative question, consider which activities are socially acceptable, and spell out the default norms accordingly. In doing so, they should assess the value of data uses against potential privacy risks, examine the practicability of obtaining true and informed consent, and keep in mind the enforceability of restrictions on data flows.

Opt-in or Opt-out?

Privacy and data protection laws are premised on individual control over information and on principles such as data minimization and purpose limitation. Yet it is not clear that minimizing information collection is always a practical approach to privacy in the age of big data. The principles of privacy and data protection must be balanced against additional societal values such as public health, national security and law enforcement, environmental protection, and economic efficiency. A coherent framework would be based on a risk matrix, taking into account the value of different uses of data against the potential risks to individual autonomy and privacy. Where the benefits of prospective data use clearly outweigh privacy risks, the legitimacy of processing should be assumed even if individuals decline to consent. For example, web analytics—the measurement, collection, analysis, and reporting of internet data for purposes of understanding and optimizing web usage—creates rich value by ensuring that products and services can be improved to better serve consumers. Privacy risks are minimal, since analytics, if properly implemented, deals with statistical data, typically in de-identified form. Yet requiring online users to opt into analytics would no doubt severely curtail its application and use.
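A small sketch of the kind of "statistical data, typically in de-identified form" that web analytics produces: raw page-view events are reduced to aggregate counts per page, and the per-user identifiers are dropped before reporting. The events are invented.

Code:

from collections import Counter

events = [
    {"user_id": "u1", "page": "/pricing"},
    {"user_id": "u2", "page": "/pricing"},
    {"user_id": "u1", "page": "/docs"},
    {"user_id": "u3", "page": "/pricing"},
]

page_views = Counter(e["page"] for e in events)
unique_visitors = {p: len({e["user_id"] for e in events if e["page"] == p}) for p in page_views}
print(page_views)        # total views per page
print(unique_visitors)   # distinct visitors per page, with no user IDs in the report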

Policymakers must also address the role of consent in the privacy framework.[12] Currently, too many processing activities are premised on individual consent. Yet individuals are ill-placed to make responsible decisions about their personal data given, on the one hand, well-documented cognitive biases, and on the other hand the increasing complexity of the information ecosystem. For example, Alessandro Acquisti and his colleagues have shown that, simply by providing users a feeling of control, businesses encourage the sharing of data, regardless of whether or not a user has actually gained control.[13] Joseph Turow and others have shown that “[w]hen consumers see the term ‘privacy policy,’ they believe that their personal information will be protected in specific ways; in particular, they assume that a website that advertises a privacy policy will not share their personal information.”[14] In reality, however, “this is not the case.”[15] Privacy policies often serve more as liability disclaimers for businesses than as assurances of privacy for consumers.

At the same time, collective action problems may generate a suboptimal equilibrium where individuals fail to opt into societally beneficial data processing in the hope of free riding on the goodwill of their peers. Consider, for example, internet browser crash reports, which very few users opt into, not so much because of real privacy concerns but rather due to a (misplaced) belief that others will do so instead. This phenomenon is evident in other contexts where the difference between opt-in and opt-out regimes is unambiguous, such as organ donation rates. In countries where organ donation is opt-in, donation rates tend to be very low compared to the rates in countries that are culturally similar but have an opt-out regime.[16] Finally, a consent-based regulatory model tends to be regressive, since individuals’ expectations are based on existing perceptions. For example, if Facebook had not proactively launched its News Feed feature in 2006 and had instead waited for users to opt-in, we might not have benefitted from Facebook as we know it today. It was only after data started flowing that users became accustomed to the change.

We do not argue that individuals should never be asked to expressly authorize the use of their information or offered an option to opt out. Certainly, for many types of data collection and use, such as in the contexts of direct marketing, behavioral advertising, third-party data brokering, or location-based services, consent should be solicited or opt-out granted. But an increasing focus on express consent and data minimization, with little appreciation for the value of uses for data, could jeopardize innovation and beneficial societal advances. The question of the legitimacy of data use has always been intended to take into account additional values beyond privacy, as seen in the example of law enforcement, which has traditionally been allotted a degree of freedom to override privacy restrictions.

Conclusion

Privacy advocates and data regulators increasingly decry the era of big data as they observe the growing ubiquity of data collection and the increasingly robust uses of data enabled by powerful processors and unlimited storage. Researchers, businesses, and entrepreneurs vehemently point to concrete or anticipated innovations that may be dependent on the default collection of large data sets. We call for the development of a model where the benefits of data for businesses and researchers are balanced against individual privacy rights. Such a model would help determine whether processing can be justified based on legitimate business interest or only subject to individual consent, and whether consent must be structured as opt-in or opt-out.

References

See, e.g., Kenneth Cukier, Data, Data Everywhere, Economist, Feb. 27, 2010, at 3-5, available at http://www.economist.com/node/15557443.

See, e.g., Omer Tene, Privacy: The New Generations, 1 Int’l Data Privacy Law 15 (2011), available at http://idpl.oxfordjournals.org/content/1/1/15.full.

Consider, for example, the draft Regulation proposed on January 25, 2012, by the European Commission to replace the 1995 Data Protection Directive. It is poised to significantly increase sanctions, expand the geographical scope of the law, tighten requirements for explicit consent, and introduce a new “right to be forgotten.” See Commission Proposal for a Regulation of the European Parliament and of the Council on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data (General Data Protection Regulation), COM (2012) 11 final (Jan. 25, 2012), available at http://ec.europa.eu/justice/data-protection/document/review2012/com_2012_11_en.pdf.

Rita Rubin, How Did Vioxx Debacle Happen?, USA Today, Oct. 12, 2004, at D1, available at http://www.usatoday.com/news/health/2004-10-12-vioxx-cover_x.htm.

See Google Flu Trends: How Does This Work?, Google, http://www.google.org/flutrends/about/how.html (last visited Jan. 25, 2012). Also consider Google Translate, which provides a free and highly useful statistical machine translation service capable of translating between roughly sixty languages by relying on algorithms and data freely available on the Web. See Inside Google Translate, Google, http://translate.google.com/about/intl/en_ALL/ (last visited Jan. 25, 2012).

McKinsey Global Inst., Big Data: The Next Frontier for Innovation, Competition, and Productivity 91-92 (2011), available at http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation.

This line of research was pioneered by Latanya Sweeney and made accessible to lawyers by Paul Ohm. See Paul Ohm, Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization, 57 UCLA L. Rev. 1701 (2010); Arvind Narayanan & Vitaly Shmatikov, Robust De-anonymization of Large Sparse Datasets, 2008 Proc. of IEEE Symp. on Security & Privacy 111; Latanya Sweeney, Simple Demographics Often Identify People Uniquely 2 (Carnegie Mellon Univ., Data Privacy Working Paper No. 3, 2000).

Ohm, supra note 7, at 1704.

See id. at 1742-43.

Ann Cavoukian & Khaled El Emam, Info. & Privacy Comm’r of Ont., Dispelling the Myths Surrounding De-identification: Anonymization Remains a Strong Tool for Protecting Privacy 7 (2011), available at http://www.ipc.on.ca/images/Resources/anonymization.pdf.

Betsy Masiello & Alma Whitten, Engineering Privacy in an Age of Information Abundance, 2010 AAAI Spring Symp. Series 119, 122, available at http://www.aaai.org/ocs/index.php/SSS/SSS10/paper/viewFile/1188/1497; see also Cynthia Dwork, Differential Privacy, 2006 Int’l Colloquium on Automata, Languages and Programming pt. II, at 8, available at http://www.dbis.informatik.hu-berlin.de/fileadmin/lectures/SS2011/VL_Privacy/Differential_Privacy.pdf (introducing the concept of a privacy-preserving statistical database, which enables users to learn properties of a population while protecting the privacy of individuals).

See, in a different context, Omer Tene & Jules Polonetsky, To Track or ‘Do Not Track’: Advancing Transparency and Individual Control in Online Behavioral Advertising, 13 Minn. J. L. Sci. & Tech. (forthcoming 2012) (arguing that “[i]n the context of online privacy . . . emphasis should be placed less on notice and choice and more on implementing policy decisions with respect to the utility of given business practices and on organizational compliance with fair information principles”). See also Nicklas Lundblad & Betsy Masiello, Opt-in Dystopias, 7 SCRIPTed 155, 155 (2010), http://www.law.ed.ac.uk/ahrc/scripted/vol7-1/lundblad.asp (contending that while “[o]pt-in appears to be the optimal solution for anyone who believes consumers should have choice and control over their personal data collection… upon closer examination, it becomes clear that opt-in is a rhetorical straw-man that cannot really be implemented by regulatory policies without creating a number of unintended side effects, many of which are suboptimal for individual privacy”).


Laura Brandimarte, Alessandro Acquisti & George Loewenstein, Misplaced Confidences: Privacy and the Control Paradox (Sept. 2010) (unpublished manuscript), available at http://www.futureofprivacy.org/wp-content/uploads/2010/09/Misplaced-Confidences-acquisti-FPF.pdf.

Joseph Turow, Chris Jay Hoofnagle, Deirdre K. Mulligan, Nathaniel Good & Jens Grossklags, The Federal Trade Commission and Consumer Privacy in the Coming Decade, 3(3) I/S: J. L. & Pol’y for Info. Soc’y 723, 724 (2007).

Id.

Consider, for example, the donation rates in Sweden (opt-out) and Denmark (opt-in), 85.9% and 4.25% respectively; or in Austria (opt-out) and Germany (opt-in), 99.9% and 12% respectively. Notice, however, that additional factors besides individuals’ willingness to participate, such as funding and regional organization, affect the ultimate conversion rate for organ transplants. Eric J. Johnson & Daniel Goldstein, Do Defaults Save Lives?, 302 Science 1338 (2003).

Offline Satyagraha

Big data
http://en.wikipedia.org/wiki/Big_data

...

The impact of “big data” has increased the demand of information management specialists in that Oracle, IBM, Microsoft, and SAP have spent more than $15 billion on software firms only specializing in data management and analytics. This industry on its own is worth more than $100 billion and growing at almost 10% a year which is roughly twice as fast as the software business as a whole.[20]



Corporations are making huge investments in tools they can use to manage the massive amount of data they're collecting; all those tweets and facebook pages, phone calls, emails, prescription records, medical histories, school records, temperature readings, seismic readings... the number of hours per day that you have your stove turned on, your lights turned on, how much gas you buy, how much heating oil you use...etc.

Lots of data - they've coined a term to cover it, "Big Data" -- Big Data for Big Brother.

At least the wiki page included this one voice in all that praise of big data:

Quote

Danah Boyd has raised concerns about the use of big data in science neglecting principles such as choosing a representative sample by being too concerned about actually handling the huge amounts of data.[22] This approach may lead to results biased in one way or another. Integration across heterogeneous data resources - some that might be considered “big data” and others not - presents formidable logistical as well as analytical challenges, but many researchers argue that such integrations are likely to represent the most promising new frontiers in science.[23] Broader critiques have also been levelled at Chris Anderson's assertion that big data will spell the end of theory: focusing in particular on the notion that big data will always need to be contextualized in their social, economic and political contexts.[24] Even as companies invest eight- and nine-figure sums to derive insight from information streaming in from suppliers and customers, less than 40% of employees have sufficiently mature processes and skills to do so. To overcome this insight deficit, “big data,” no matter how comprehensive or well analyzed, needs to be complemented by “big judgment.”[25]


"big judgment" -- and, since "less than 40% of employees have sufficiently mature processes and skills", we can assume that big judgment must come from a select group of people, perhaps specially-trained...

How did they decide that the number was "less than 40%"?
Was there a special test?
Who got to take the test?
This "less than 40%" of employees number is something pulled out of someone's arse.

Offline Satyagraha

Management Education in the Age of Big Data
http://smartdatacollective.com/gilpress/46921/management-education-age-big-data
Posted February 22, 2012 with 278 reads

What if business schools based their entire curriculum on the fundamentals of business analytics?

McKinsey estimates that the demand for “deep analytical positions” in the U.S. will exceed supply by 140,000 to 190,000 positions and that there will be a need for 1.5 million additional ”managers and analysts who can ask the right questions and consume the results of the analysis of big data effectively.”

Why assume that “the analysis of big data” will be some special skill performed by a select group of “managers and analysts”? I think that it’s plausible, even desirable, that in the future all managers will be able to speak the language of business analytics and that it will become an integral part of their job.


Now that the tools for collecting massive amounts of data are in place, the new technocrats must be trained to correctly interpret all this data... analytics training at major university business schools... (to ensure that the data are interpreted correctly: e.g., with the correct agenda driving the interpretation).

Remember Zbigniew Brzezinski's words...

“The technotronic era involves the gradual appearance of a more controlled society.

Such a society would be dominated by an elite, unrestrained by traditional values. [...]

The capacity to assert social and political control over the individual will vastly increase.

It will soon be possible to assert almost continuous surveillance over every citizen and to maintain up-to-date, complete files, containing even most personal information about the health or personal behavior of the citizen in addition to more customary data. These files will be subject to instantaneous retrieval by the authorities.”


This is the very definition of "Big Data".

It requires "specially trained managers" to interpret it.

The average lowlife humanoid can't POSSIBLY use their analytical skills to make decisions using the "Big Data", so we will of course need specially-indoctrinated, er.. specially-trained managers from elite business schools to do our interpreting for us.

Offline Dig

Nazi War Criminal Spills the Beans in 1946 about BIG DATA, and IBM was the main contractor then as it is now.

THIS IS A TRANSCRIPT OF THE ACTUAL NUREMBERG TRIAL IN 1946

NOTE: Albert Speer was the head of all armaments for Hitler. When he took over the position, he transformed the armament and bombardment operations by introducing a system of 'rationalization' into the war effort, bringing businessmen into the process of evaluating war product development and setting goals of 'efficiency'. His operations led to the development of so-called 'vengeance' weapons like the V2. Although he lied his ass off regarding his knowledge of the death camps, he was perhaps one of the greatest whistleblowers regarding the cybernetic/technocratic enslavement system to come. It is serendipitous that Kennedy hired McNamara as Secretary of Defense and that the power elite in the US used McNamara's skill set, so similar to Speer's, to engage in 'rational' genocides in SE Asia. Today we are seeing that our entire defense operations are being controlled by technocrats with similar visions of global cybernetic enslavement.
http://www.nizkor.org/hweb/imt/tgmwc/tgmwc-22/tgmwc-22-216-07.shtml

THE PRESIDENT: I call on the defendant Albert Speer.

DEFENDANT SPEER: Mr. President, may it please the Tribunal: Hitler and the collapse of his system have brought a time of tremendous suffering upon the German people. The useless continuation of this war and the unnecessary destruction make the work of reconstruction more difficult. Privation and misery have come to the German people. After this trial, the German people will despise and condemn Hitler as the proved author of its misfortune. But the world will learn from these happenings not only to hate dictatorship as a form of government, but to fear it.

Hitler's dictatorship differed in one fundamental point from all its predecessors in history.

His was the first dictatorship in the present period of modern technical development,

a dictatorship which made a complete use of all technical means in a perfect manner for the domination of its own country.


Through technical devices like the radio and the loudspeaker, eighty million people were deprived of independent thought. It was thereby possible to subject them to the will of one man. The telephone, teletype and radio made it possible, for instance, that orders from the highest sources could be transmitted directly to the lowest ranking units, by whom, because of the high authority, they were carried out without criticism. From this it resulted that numerous offices and headquarters were directly attached to the supreme leadership, from which they received their sinister orders directly. Another result was the far-reaching supervision of the citizens of the State and the maintenance of a high degree of secrecy for criminal events.

Perhaps to the outsider this machinery of the State may appear like the cables of a telephone exchange - apparently without system. But, like the latter, it could be served and dominated by one single will.

Earlier dictators during their work of leadership needed highly qualified assistants, even at the lowest level, men who could think and act independently. The totalitarian system in the period of modern technical development can dispense with them; the means of communication alone make it possible to mechanize the lower leadership. As a result of this there arises the new type of the uncritical recipient of orders.

We had only reached the beginning of the development. The nightmare of many a man that one day nations could be dominated by technical means was all but realized in Hitler's totalitarian system.

Today the danger of being terrorized by technocracy threatens every country in the world. In modern dictatorship this appears to me inevitable. Therefore, the more technical the world becomes, the more necessary is the promotion of individual freedom and the individual's awareness of himself as a counterbalance.

Hitler not only took advantage of technical developments to dominate his own people - he nearly succeeded, by means of his technical lead, in subjugating the whole of Europe. It was merely due to a few fundamental shortcomings of organization, such as are typical in a dictatorship because of the absence of criticism, that he did not have twice as many tanks, aircraft, and submarines before 1942.

But if a modern industrial State utilizes its intelligence, its science, its technical developments and its production for a number of years in order to gain a lead in the sphere of armament, then, even with a sparing use of its manpower, it can, because of its technical superiority, completely overtake and conquer the world, if other nations should employ their technical abilities during that same period only on behalf of the cultural progress of humanity.

The more technical the world becomes, the greater this danger will be, and the more serious will be an established lead in the technical means of warfare.

This war ended with remote-controlled rockets, aircraft with the speed of sound, new types of submarines, torpedoes which find their own targets, with atom bombs, and with the prospect of a horrible kind of chemical warfare.

Of necessity the next war will be overshadowed by these new destructive inventions of the human mind.

In five to ten years the technique of warfare will make it possible to fire rockets from continent to continent with uncanny precision. By atomic fission it can destroy one million people in the centre of New York in a matter of seconds with a rocket manned, perhaps, by only ten men, invisible, without previous warning, faster than sound. Science is able to spread pestilence among human beings and animals and to destroy crops by insect warfare. Chemistry has developed terrible weapons with which it can inflict unspeakable suffering upon helpless human beings.

Will there ever again be a nation which will use the technical discoveries of this war for the preparation of a new war, while the rest of the world is employing the technical progress of this war for the benefit of humanity, thus attempting to create a slight compensation for its horrors?

As a former minister of a highly developed armament system, it is my last duty to say the following:

A new large-scale war will end with the destruction of human culture and civilization. Nothing prevents unconfined technique and science from completing the work of destroying human beings, which it has begun in so dreadful a way in this war.
