ECE/TCOM5583: Information Theory
This is a graduate introductory course on information theory, which I have offered for almost ten years. For this offering I have decided to make significant changes to the material. Past topics such as the Kraft inequality and the separation theorem are intellectually interesting. Unfortunately, I found that most of my students (including myself) will not end up becoming information theorists, and so most of them do not feel connected to those topics.
Therefore, I plan to follow Professor MacKay's approach for this offering: try to incorporate core information theory material (for example, information measures, the AEP, and source and channel coding theory) into statistical learning. Accordingly, the main textbook will be Professor MacKay's textbook — Information Theory, Inference, and Learning Algorithms. Even so, I probably will not follow this book very closely, and I am inclined to borrow heavily from materials on the Internet. Two other most important reference texts are
And a nice reference on network information theory is
However, we won't go into network information theory for this introductory course.
Other Reference Materials
Shannon, C. E. (1948), "A mathematical theory of communication," Bell Sys. Tech. J. 27: 379–423, 623–656.
R. W. Yeung, On entropy, information inequalities, and Groups.
Law of Large Numbers by Terry Tao.
A First Course in Information Theory, by Raymond W. Yeung, New York: Springer.
Information Theory and Reliable Communication by R. Gallager, New York: Wiley.
Information Theory by Csiszar and Korner, New York: Academic Press.
Entropy and Information Theory by R. M. Gray, Springer-Verlag, 1990.
Probability, Random Processes, and Ergodic Properties by R. M. Gray, Springer-Verlag, 1988.
J. S. Yedidia, W. T. Freeman, and Y. Weiss, Understanding Belief Propagation and its Generalizations in Exploring Artificial Intelligence in the New Millennium: Science and Technology Books, 2003.
S. Verdú, "Fifty years of Shannon theory," IEEE Transactions on Information Theory, vol. 44, pp. 2057–2078, 1998.
I. Csiszár, "The method of types," IEEE Transactions on Information Theory, vol. 44, pp. 2505–2523, 1998.
Information Theoretic Inequality Prover
Link to other course materials
Office Hours
There are no “regular” office hours, but you are very likely to find me in my office any weekday afternoon. You are welcome to come catch me anytime or to contact me by email.
Course Syllabus (Tentative)
Probability review
Maximum likelihood estimator, MAP, Bayesian estimator
Graphical models and message passing algorithms
Lossless source coding theory, Huffman coding, and introduction to arithmetic coding
Asymptotic equipartition property (AEP), typicality and joint typicality
Entropy, conditional entropy, mutual information, and their properties
Channel coding theory, capacity, and Fano’s inequality
Continuous random variables, differential entropy, Gaussian source, and Gaussian channel
Error-correcting codes, linear codes, and introduction to low-density parity-check (LDPC) codes
Method of types, large deviation theory, and the maximum entropy principle
N.B. You should expect to be exposed to some Python and Matlab. You won't become an expert in these tools after this class, but it is good to get your hands dirty and play with them early.
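As a taste of the kind of hands-on play mentioned above, here is a small Python sketch (not part of the official course materials) that computes the basic information measures from the syllabus — entropy, conditional entropy, and mutual information — for a made-up joint distribution of two binary variables:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits; zero-probability outcomes are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Illustrative joint pmf p(x, y); rows index X, columns index Y.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

p_x = p_xy.sum(axis=1)        # marginal of X
p_y = p_xy.sum(axis=0)        # marginal of Y

H_xy = entropy(p_xy)          # joint entropy H(X, Y)
H_x = entropy(p_x)            # H(X)
H_y = entropy(p_y)            # H(Y)

H_y_given_x = H_xy - H_x      # chain rule: H(Y|X) = H(X,Y) - H(X)
I_xy = H_x + H_y - H_xy       # mutual information I(X;Y)

print(f"H(X) = {H_x:.3f} bits, H(Y|X) = {H_y_given_x:.3f} bits, "
      f"I(X;Y) = {I_xy:.3f} bits")
```

Since the marginals here are uniform, H(X) = H(Y) = 1 bit, and the positive correlation between X and Y shows up as a strictly positive I(X;Y).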
Projects
Grading
Homework: 20%.
"Midterm": 30%. Closed-book test; no Internet access or cellphones, but you may bring a calculator and two letter-size cheat sheets.
Final Project: 50%.
Calendar
| Date | Topics | Materials |
| --- | --- | --- |
| 8/25 | Overview of IT, probability overview, ML, MAP, Bayesian inference | Slides from Berkeley CS188, slides from Wash U CSE 473 |
| 9/1 | Conjugate prior, Bernoulli distribution, binomial distribution, Poisson distribution, Beta distribution, summary | Jupyter notebook on distributions |
| 9/8 | Multinomial distribution, Dirichlet distribution, normal distribution, overview of Source Coding Theorem, Huffman coding, Lagrange multiplier, summary | |
| 9/15 | Karush-Kuhn-Tucker conditions, Kraft's inequality, Shannon-Fano-Elias codes, proof of Source Coding Theorem, summary | |
| 9/22 | Differential entropy, Jensen's inequality, joint entropy, conditional entropy, summary | |
| 9/29 | Kullback-Leibler divergence, mutual information, cross entropy, application example: TF-IDF, summary | TF-IDF notebook by Mike Bernico |
| 10/6 | Decision trees, random forests, law of large numbers, asymptotic equipartition property (AEP), typical sequences, summary | Random Forest notebook by Jason Sanchez |
| 10/13 | Jointly typical sequences, packing and covering lemmas, channel capacity, Channel Coding Theorem, summary | |
| 10/20 | Converse proof of Channel Coding Theorem, rate-distortion problem, rate-distortion theorem, summary | |
| 10/27 | Method of types, universal source coding, large deviation theory, summary | |
| 11/3 | Review, summary | |
| 11/10 | Midterm | |
| 11/17 | Review of midterm solution | |
| 11/24 | Thanksgiving holiday | |
| 12/1 | Overview of reinforcement learning, Markov decision process, Q-learning | Slides from Berkeley AI courses: MDP I, MDP II, Reinforcement Learning I, Reinforcement Learning II |
| 12/8 | Bayesian net, belief propagation algorithm, IRA/LDPC codes, summary | Bayesian network examples, LDPC decoding in Python |
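Huffman coding appears in both the syllabus and the 9/8–9/15 sessions. A minimal Python sketch of the idea, using an illustrative source with made-up (dyadic) probabilities, might look like the following; for dyadic sources the average codeword length exactly meets the entropy bound of the Source Coding Theorem:

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code for a dict {symbol: probability}.

    Returns {symbol: codeword string}. A running counter breaks ties
    in the heap so entries with equal probability never compare dicts.
    """
    tiebreak = count()
    # Heap entries: (subtree probability, tiebreak, {symbol: partial codeword})
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # least probable subtree
        p1, _, c1 = heapq.heappop(heap)   # next least probable subtree
        # Merge: prepend 0 to one subtree's codewords and 1 to the other's.
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Illustrative dyadic source: entropy is 1.75 bits/symbol.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(f"average length = {avg_len} bits/symbol")  # matches the entropy here
```

The resulting code is prefix-free, so a concatenated bit stream can be decoded symbol by symbol without separators.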