INFORMATION THEORY AND THE DIGITAL AGE

AFTAB, CHEUNG, KIM, THAKKAR, YEDDANAPUDI
6.933 – FINAL PAPER
6.933 Project History, Massachusetts Institute of Technology
SNAPES@MIT.EDU

INTRODUCTION

Information Theory is one of the few scientific fields fortunate enough to have an identifiable beginning: Claude Shannon's 1948 paper. The story of how it progressed from a single theoretical paper to a broad field that has redefined our world is a fascinating one. It provides the opportunity to study the social, political, and technological interactions that have helped guide its development and define its trajectory, and it gives us insight into how a new field evolves.

We often hear Claude Shannon called the father of the Digital Age. In the beginning of his paper, Shannon acknowledges the work done before him by such pioneers as Harry Nyquist and R.V.L. Hartley at Bell Labs in the 1920s. Though their influence was profound, the work of those early pioneers was limited and focused on their own particular applications. It was Shannon's unifying vision that revolutionized communication and spawned the multitude of communication research that we now define as the field of Information Theory.

One of those key concepts was his definition of the limit for channel capacity. Similar to Moore's Law, the Shannon limit can be considered a self-fulfilling prophecy. It is a benchmark that tells people what can be done, and what remains to be done – compelling them to achieve it.

What made possible, what induced the development of coding as a theory, and the development of very complicated codes, was Shannon's Theorem: he told you that it could be done, so people tried to do it. [Interview with Fano, R. 2001]

In the course of our story, we explore how the area of coding, in particular, evolves to reach this limit. It was the realization that we were not even close to it that renewed interest in communications research.

Information Theory was not just a product of the work of Claude Shannon. It was the result of crucial contributions made by many distinct individuals, from a variety of backgrounds, who took his ideas and expanded upon them. Indeed, the diversity and directions of their perspectives and interests shaped the direction of Information Theory.

In the beginning, research was primarily theoretical, with few perceived practical applications. Christensen says that the innovator's dilemma is that he cannot garner support for his new ideas because he cannot always guarantee an end profit. Fortunately, Information Theory was sponsored in anticipation of what it could provide. This perseverance and continued interest eventually resulted in the multitude of technologies we have today.

In this paper, we explore how these themes and concepts manifest in the trajectory of Information Theory. It begins as a broad spectrum of fields, from management to biology, all believing Information Theory to be a 'magic key' to multidisciplinary understanding. As the field moved from this initial chaos, various influences narrowed its focus. Within these established boundaries, external influences such as the space race steered the progress of the field.
Through it all, the expansion of Information Theory was constantly constrained by the limitations of hardware technology – indeed, the lack of such technology caused the 'death' of Information Theory, and its widespread availability is behind its current overwhelming success.

SHANNON'S "MATHEMATICAL THEORY OF COMMUNICATION"

"Before 1948, there was only the fuzziest idea of what a message was. There was some rudimentary understanding of how to transmit a waveform and process a received waveform, but there was essentially no understanding of how to turn a message into a transmitted waveform." [Gallager, Claude Shannon: A Retrospective, 2001, pg. 2683]

In 1948, Shannon published his paper "A Mathematical Theory of Communication" in the Bell System Technical Journal. He showed how information could be quantified with absolute precision, and demonstrated the essential unity of all information media. Telephone signals, text, radio waves, and pictures – essentially every mode of communication – could be encoded in bits. The paper provided a "blueprint for the digital age" [Gallager, R. Quoted in Technology Review].

Since the Bell System Technical Journal was targeted only towards communication engineers, mathematician Warren Weaver "had the feeling that this ought to reach a wider audience than (just) people in the field," recalls Betty Shannon [Shannon, B. Phone Interview]. He met with Shannon, and together they published "The Mathematical Theory of Communication" in 1949. The change from "A" to "The" established Shannon's paper as the new "scripture" on the subject, and it allowed the work to reach a far wider group of people.

Why was Shannon's paper so influential? Why do people refer to it as one of the greatest intellectual triumphs of the twentieth century? The answer lies in the groundbreaking concepts that A Mathematical Theory of Communication contains – concepts influential enough to help change the world. There are four major concepts in Shannon's paper, and understanding each is essential to understanding the impact of Information Theory.

Channel Capacity & The Noisy Channel Coding Theorem

Perhaps the most eminent of Shannon's results was the concept that every communication channel has a speed limit, measured in binary digits per second: this is the famous Shannon Limit, exemplified by the famous and familiar formula for the capacity of a White Gaussian Noise Channel:

C = W log₂((P + N) / N)

where W is the channel bandwidth, P the average signal power, and N the noise power.

The bad news is that it is mathematically impossible to get error-free communication above the limit. No matter how sophisticated an error correction scheme you use, no matter how much you can compress the data, you cannot make the channel go faster than the limit without losing some information. The good news is that below the Shannon Limit, it is possible to transmit information with zero error. Shannon mathematically proved that there were ways of encoding information that would allow one to get up to the limit without any errors – regardless of the amount of noise or static, or how faint the signal was.
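To make the limit concrete, here is a minimal Python sketch (our illustration, not anything from Shannon's paper) that evaluates the capacity formula above for the classic textbook case of a voice-grade telephone line: roughly 3 kHz of bandwidth and a 30 dB signal-to-noise ratio, i.e. P/N = 1000.

    import math

    def shannon_capacity(bandwidth_hz, signal_power, noise_power):
        # Shannon Limit for a white Gaussian noise channel:
        # C = W * log2((P + N) / N), in bits per second.
        return bandwidth_hz * math.log2((signal_power + noise_power) / noise_power)

    # Voice-grade telephone line: ~3 kHz bandwidth, 30 dB SNR (P/N = 1000).
    capacity = shannon_capacity(bandwidth_hz=3000.0, signal_power=1000.0, noise_power=1.0)
    print(f"{capacity:.0f} bits per second")  # about 29,900 bits per second

The answer, about 30,000 bits per second, is close to the speeds at which analog voice-band modems eventually topped out: a reminder that the limit is a property of the channel, not of the cleverness of the engineer.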
Of course, one might need to encode the information with more and more bits, so that most of them would get through and those lost could be regenerated from the others. The increased complexity and length of the message would make communication slower and slower, but essentially, below the limit, you could make the probability of error as low as you wanted.

To make the chance of error as small as you wish? Nobody had ever thought of that. How he got that insight, how he even came to believe such a thing, I don't know. But almost all modern communication engineering is based on that work. [Fano, R. Quoted in Technology Review, Jul 2001]

The noisy channel coding theorem is what gave rise to the entire field of error-correcting codes and channel coding theory: the concept of introducing redundancy into the digital representation to protect against corruption. Today, if you take a CD, scratch it with a knife, and play it back, it will play back perfectly. That's thanks to the noisy channel coding theorem.
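The simplest way to see redundancy at work is a repetition code. The sketch below is our toy example, not a scheme from Shannon's paper (real channel codes, such as the Reed-Solomon codes used on CDs, are far more efficient): it sends every bit three times and decodes by majority vote, so any single flipped bit within a triple is corrected.

    import random

    def encode(bits):
        # Repetition code: transmit each bit three times.
        return [b for b in bits for _ in range(3)]

    def noisy_channel(bits, flip_prob=0.05):
        # Flip each transmitted bit independently with probability flip_prob.
        return [b ^ (random.random() < flip_prob) for b in bits]

    def decode(received):
        # Majority vote over each group of three received bits.
        return [int(sum(received[i:i + 3]) >= 2) for i in range(0, len(received), 3)]

    message = [1, 0, 1, 1, 0, 0, 1, 0]
    print(decode(noisy_channel(encode(message))))  # almost always recovers the message

The price is a rate of only 1/3, and the per-bit error probability falls merely from p to about 3p². Shannon's theorem promises something far stronger: codes that drive the error probability toward zero while keeping the rate anywhere below capacity.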
Formal Architecture of Communication Systems

Shannon also offered a formal architecture as a schematic for a general communication system. Flip open to the beginning of any random textbook on communications, or even a paper or a monograph, and you will find this diagram.

[Figure 1. From Shannon's "A Mathematical Theory of Communication," page 3.]

This figure represents one of the great contributions of A Mathematical Theory of Communication: the architecture and design of communication systems. It demonstrates that any communication system can be separated into components, which can be treated independently as distinct mathematical models. Thus, it is possible to completely separate the design of the source from the design of the channel. Shannon himself realized that his model had "applications not only in communication theory, but also in the theory of computing machines, the design of telephone exchanges and other fields." [Shannon, C. A Mathematical Theory of Communication, pg. 3] All of today's communication systems are essentially based on this model – it is truly 'a blueprint for the digital age'.

Digital Representation

Shannon also realized that the content of the message was irrelevant to its transmission: it did not matter what the message represented. It could be text, sound, image, or video, but it was all 0's and 1's to the channel. In a follow-up paper, Shannon also pointed out that once data was represented digitally, it could be regenerated and transmitted without error. This was a radical idea to engineers who were used to thinking of transmitting information as an electromagnetic waveform over a wire. Before Shannon, communication engineers worked in their own distinct fields, each with its own distinct techniques: telegraphy, telephony, audio and data transmission had nothing to do with each other. Shannon's vision unified all of communication engineering, establishing that text, telephone signals, images and film – all modes of communication – could be encoded in bits, a term that was first used in print in his article. This digital representation is the fundamental basis of all we have today.

Efficiency of Representation: Source Coding

In his paper, Shannon also discusses source coding, which deals with efficient representation of data; today the term is synonymous with data compression. The basic objective of source coding is to remove redundancy in the information to make the message smaller. In his exposition, Shannon discusses a lossless method of compressing data at the source, using a variable-rate block code, later called a Shannon-Fano code. A challenge raised by Shannon in his 1948 paper was the design of a code that was optimal in the sense that it would minimize the expected length (the Shannon-Fano code he introduced is not always optimal). Three years later, David Huffman, a student in Prof. Fano's class at MIT, came up with Huffman Coding, which is widely used for data compression; JPEG, MP3 and .ZIP files are only some examples.
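To illustrate what "minimize the expected length" means, here is a compact Python sketch of Huffman's construction (our example, with made-up symbol probabilities, not Huffman's original presentation): repeatedly merge the two least probable symbol groups, prefixing a bit to their codewords at each merge.

    import heapq
    import math

    def huffman_code(probabilities):
        # Assumes at least two symbols. Heap entries are
        # (probability, tie-breaker, {symbol: partial codeword}).
        heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            # Merge the two least probable groups, prefixing 0/1 to their codewords.
            p1, _, codes1 = heapq.heappop(heap)
            p2, _, codes2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in codes1.items()}
            merged.update({s: "1" + c for s, c in codes2.items()})
            heapq.heappush(heap, (p1 + p2, counter, merged))
            counter += 1
        return heap[0][2]

    probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}      # made-up source
    code = huffman_code(probs)   # e.g. {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
    average = sum(probs[s] * len(code[s]) for s in probs)      # 1.75 bits/symbol
    entropy = -sum(p * math.log2(p) for p in probs.values())   # 1.75 bits/symbol
    print(code, average, entropy)

For this source the average codeword length meets the source entropy exactly; in general, Huffman's code is optimal among symbol-by-symbol codes, and its expected length always lies within one bit of the entropy.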
Entropy & Information Content

As we've discussed, Shannon's paper expressed the capacity of a channel, defining the amount of information that can be sent down a noisy channel in terms of transmit power and bandwidth. In doing so, Shannon showed that engineers could choose to send a given amount of information using high power and low bandwidth, or high bandwidth and low power.

The traditional solution was to use narrow-band radios, which would focus all their power into a small range of frequencies. The problem was that as the number of users increased, the number of channels began to be used up. Additionally, such radios were highly susceptible to interference: so much power was confined to a small portion of the spectrum that a single interfering signal in the frequency range could disrupt communication.

Shannon offered a solution to this problem by redefining the relationship between information, noise and power. Shannon quantified the amount of information in a signal, stating that it is the amount of unexpected data the message contains. He called this information content of a message 'entropy'. In digital communication, a stream of unexpected bits is just random noise. Shannon showed that the more a transmission resembles random noise, the more information it can hold, as long as it is modulated to an appropriate carrier: one needs a low entropy carrier to carry a high entropy message.

Thus Shannon stated that an alternative to narrow-band radios was sending a message with low power, spread over a wide bandwidth. Spread spectrum is just such a technique: it takes a narrow-band signal and spreads its power over a wide band of frequencies. This makes it incredibly resistant to interference. However, it does use additional frequency ranges, and thus the FCC until recently had confined the technique to the military. It is now widely used in CDMA cellular phones.
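The idea that information is "surprise" can be made concrete with Shannon's entropy formula, H = -Σ p log₂(p). A short Python sketch (our illustration, not Shannon's) shows that a fair coin carries a full bit per toss, while a predictable, biased coin carries much less:

    import math

    def entropy(probabilities):
        # Shannon entropy H = -sum(p * log2(p)), in bits per symbol.
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit per toss
    print(entropy([0.9, 0.1]))  # biased coin: ~0.47 bits, far more predictable
    print(entropy([1.0]))       # certain outcome: 0 bits, no information at all

This is also why a stream compressed all the way down to its entropy looks statistically like random noise: every remaining bit is unexpected.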
Now that we've discussed some of the fundamental concepts in Shannon's work, let's take a step back and see how the formalization of these concepts started a chain of research that eventually became known as the field of Information Theory.

TRAJECTORY OF INFORMATION THEORY - I

We begin by exploring the history of Information Theory: how the field evolved and weathered various influences to become what it is today. In essence, we chart the trajectory of a new science.

Creating the Field

Information Theory grew out of the concepts introduced in "A Mathematical Theory of Communication." Although the phrase "information theory" was never used in the paper, Shannon's emphasis on the word "information" probably helped coin the term. The idea that something as nebulous as "information" could be quantified, analyzed, and reduced to a mathematical formula attracted tremendous attention, and this initial excitement gave life to the field. But what were the forces that enabled this process?

According to Latour, one of the tasks in creating a new field is gathering the support and enthusiasm of the masses [Latour, B. Science in Action, pg. 150]. Although Shannon had intended his audience to be confined to communication engineers, his concepts and methodology of thinking quickly moved into the popular press. A 1953 Fortune magazine article gushingly describes the field as more crucial to 'man's progress in peace, and security in war' than Einstein's nuclear physics. Perhaps without popular support and the interest of researchers from other fields, Information Theory would not exist as it does today.

Another task in creating a new field is to recruit amateurs for the research workforce [Latour, B. Science in Action, pg. 150]. As previously mentioned, Shannon's 1948 paper attracted a multitude of individuals to Information Theory research. At the time, these researchers were all amateurs to whom Shannon's paper had opened up entirely new ways of tackling the problem of transmission of information. These amateurs, however, soon became the experts [Eden, M. Interview] and subsequently guided the direction of the field.

Circulation and Propagation of Ideas

Identifying the factors that transformed a single paper into a flourishing field requires an investigation into the activities that occurred soon after Shannon introduced his theory. Initially there was an absolute fervor of excitement. Universities began to offer seminars, which later developed into classes. The Institute of Radio Engineers, or IRE (which later merged with the American Institute of Electrical Engineers, AIEE, on January 1, 1963 to form the IEEE), published papers on current research in a journal meant to focus solely on Information Theory, and formed a group called the Professional Group on Information Theory, or the PGIT. In addition, symposia were organized to present these papers and to allow forum discussions.

Amidst all the initial enthusiasm, many felt that with all the new concepts and research being generated, there was a need for a younger generation to get involved. As a result, seminars and departments were organized at universities such as the University of Michigan and the Università di Napoli. These seminars later developed into classes, which influenced the field because they discussed current research questions and produced graduate students who would eventually become the field's new practitioners.

Professor Fano, in fact, taught one of the first courses, 6.574, commonly known as the 'Information Theory Course', at MIT. In his early lectures, Fano began by acknowledging that his subject matter was yet to be fully defined:

Let's start by specifying a model of communication system to which the theory to be developed shall apply… This model should be sufficiently general to include, as special cases, most of the communication systems of practical interest, yet simple enough to lend itself to a detailed quantitative study. [Fano, R. 6.574 lecture notes, MIT Archives]

At the time, Professor Fano taught his class using current research and its directions as his source of teaching material. From it he drew his assigned readings, problem sets, exams and final project questions. In fact, Huffman Coding, a form of efficient representation, originated from a final paper that Fano assigned. A second course, 6.575 "Advanced Topics in Information Theory," was later taught by Shannon himself after he took a professorship at MIT in 1956. Professor G. David Forney, Jr. credits this course "as the direct cause of his return to Information Theory." [IT Society Newsletter, pg. 21]

Today, neither an Information Theory department nor a specific Information Theory program exists within the EECS department at MIT. The field has become too ubiquitous, and its offshoots are taught under a multitude of different areas: Computer Science, Information Technology, Electrical Engineering, Mathematics. Moreover, the concepts developed through Information Theory research have been integrated into the course material of different engineering disciplines. The "Information Theory Course" numbered 6.574 still exists today in the form of 6.441 "Transmission of Information."

We see that, as if following Latour's counsel, Information Theory quickly found its way into the curriculum at various educational institutions, and Shannon secured a university position – two more tasks that Latour considers important to creating a field [Latour, B. Science in Action, pg. 150].

Education did not only take place in the classroom, though. The IRE Transactions on Information Theory became a journal whose "primary purpose [was] associated with the word 'education' and more specifically, the education of the PGIT membership in tune with current interests and trends" [Cheathem, T. A Broader Base for the PGIT, IEEE Transactions, 1958, pg. 135]. As a well-known, well-read, and well-respected journal, it had a great deal of control over the information and research that reached its readers. The Transactions, in a way, guided the field through the research it chose to present in its publications. It published editorials by respected scientists in the field, including such influential voices as Claude Shannon, Peter Elias, and Norbert Wiener. Its correspondence section served as a written forum of discussion, containing comments and reactions to published materials, either within the journal or elsewhere.

In addition to classes and the IRE journals, early symposia played a key role in the growth of Information Theory. The purpose of the symposia was to introduce cutting-edge research and to foster an atmosphere of education and discussion. For these symposia, the organizers searched for the "cream of the crop" in terms of papers, leaving out tutorials and reviews. Abstracts were submitted by many individuals from various areas of research and reviewed by a committee that judged whether the material was within the scope of the conference. Much effort was expended to keep the quality of research as high as possible. We should note that although this selection process was necessary to obtain worthy papers within the interests of the attendees, it opened the possibility of bias toward the interests of the members of the organizing committee. Despite the selection process, the early symposia reflected a broadening in scope and an explosion of excitement.
At the first London Symposium, held in 1950, six of the twenty papers presented were about psychology and neurophysiology. This number increased to eight by the time of the second symposium. But by the third, held in 1956, the scope was so wide that it included participants with backgrounds in fields as diverse as "anatomy, animal welfare, anthropology, computers, economics, electronics, linguistics, mathematics, neuropsychiatry, neurophysiology, philosophy, phonetics, physics, political theory, psychology, and statistics." [Blachman, N. A report on the third London Symposium, IEEE Transactions, March 1956, pg. 17]

Bandwagon

By the mid-50's, it was becoming apparent that Information Theory had become somewhat of a fad, largely because of confusion as to what Information Theory truly was.

I didn't like the term Information Theory. Claude didn't like it either. You see, the term 'information theory' suggests that it is a theory about information – but it's not. It's the transmission of information, not information. Lots of people just didn't understand this… I coined the term 'mutual information' to avoid such nonsense: making the point that information is always about something. It is information provided by something, about something. [Interview with Fano, R. 2001]

Such misconceptions, together with the belief that Information Theory would serve as a unifying agent across a diverse array of disciplines, led some researchers to attempt to apply Information Theory terminology to the most unlikely of fields.

…birds clearly have the problem of communicating in the presence of noise… an examination of birdsong on the basis of information theory might… suggest new types of field experiment and analysis... [Bates, J. "Significance of Information Theory to Neurophysiology." Feb 1953: pg. 142]

Countless shallow articles based on 'non-engineering' fields were being published in the IRE Transactions at the time. Worse yet, researchers would deliberately introduce the words 'Information Theory', or 'Cybernetics' as it was alternatively called, into their work in hopes of attracting funding. These blind attempts to apply Information Theory to 'everything under the sun' created a great deal of controversy within the PGIT about what the bounds of the field should be. In December of 1955, L.A. De Rosa, chairman of the PGIT, formalized these tensions in an editorial titled "In Which Fields Do We Graze?"

Should an attempt be made to extend our interests to such fields as management, biology, psychology, and linguistic theory, or should the concentration be strictly in the direction of communication by radio or wire? [De Rosa, L.A. "In Which Fields Do We Graze?" Dec 1955: 2]

PGIT members were divided. Some believed that if knowledge and application of Information Theory were not extended beyond radio and wire communications, progress in other fields could be delayed or stunted; by broadening the scope of the PGIT, knowledge would be shared with other areas. Others insisted on confining the field to developments in radio, electronics, and wire communications. The two points of view were hotly debated over the next few years through correspondence in the Transactions and elsewhere.

This is a clear example of the Great Divide, as it is defined by Latour [Latour, B. Science in Action, pg. 211]. The PGIT is a scientific network.
Within the PGIT, there existed an inner and an outer network. Latour's "insiders" consist of the members who believed that Information Theory should be confined to communications engineers (the purists). The "outsiders," of course, are the members who supported expanding Information Theory to other fields. In the Great Divide, the insiders do not believe that the outsiders have a correct understanding of the nature of the field.

By 1956, the debate had become heated enough that the father of the field had to address it. In his March editorial, "The Bandwagon," Claude Shannon responded to De Rosa's question, taking the side of the purists. He wrote in his usual gentle fashion, but showed signs of frustration at the state of Information Theory. Shannon felt that Information Theory had "ballooned" into more than it actually was because of its novelty and popular exposure. Shannon's wife, Betty Shannon, commented, "He got a little irritated with the way people were pulling it around. People didn't understand what he was trying to do." [Shannon, B. Phone Interview] Shannon had intended the theory to be directed in a very specific manner, and therefore believed that it might not be relevant to other disciplines. Moreover, he believed that the IRE Transactions, being an academic journal, should require more carefully researched papers that would appropriately – and not just superficially – apply Information Theory, and do so in a more rigorous manner.

A thorough understanding of the mathematical foundation and its communication application is surely a prerequisite to other applications. I personally believe that many of the concepts of information theory will prove useful in these other fields – and, indeed, some results are already quite promising – but the establishing of such applications is not a trivial matter of translating words to a new domain, but rather the slow tedious process of hypothesis and experimental verification. [Shannon, "The Bandwagon," March 1956]

Norbert Wiener, another influential member of the PGIT, agreed with Shannon that the concept was being wrongly thought of as the solution to all informational problems.

...As Dr. Shannon suggests in his editorial: The Bandwagon, [Information Theory] is beginning to suffer from the indiscriminate way in which it has been taken as a solution of all informational problems, a sort of magic key. I am pleading in this editorial that Information Theory... return to the point of view from which it originated: the … statistical concept of communication. [Wiener, "What Is Information Theory?" June 1956]

Such editorials made the views of the core of the PGIT clear. There was a rapid reduction in the number of 'fluffy' papers in the Transactions; the topics increasingly focused on new research in communication engineering. By 1958, the fate of the field had pretty much been decided. Peter Elias's scathing 1958 editorial "Two Famous Papers" crystallized the "great divide."
He took a much harsher stance than Shannon's in describing a typical paper that should not be published:

The first paper has the generic title 'Information Theory, Photosynthesis and Religion'… written by an engineer or physicist… I suggest that we stop writing [it], and release a large supply of man power to work on… important problems which need investigation. [Elias, P. "Two Famous Papers," 1958]