Fundamentals of Information Theory: A Study of Source Entropy and Its Properties
This article studies discrete source entropy and its properties: the meaning of entropy, how it is computed, and its symmetry, determinacy, expansibility, extremum property, and upper convexity. These properties are verified experimentally with MATLAB programs. Specifically, the experiments show:
Symmetry: reordering the probabilities of a distribution leaves the source entropy unchanged;
Determinacy: when one symbol has probability 1, the entropy is 0;
Expansibility: adding a symbol of vanishingly small probability leaves the entropy essentially unchanged;
Extremum property: among all distributions on q symbols, the entropy is maximized by the equiprobable distribution, where it equals log2(q);
Upper convexity: the entropy function is an upper convex (concave) function of the distribution.
The experimental results agree with the theoretical analysis and confirm these properties of discrete source entropy.
I. Objectives
Develop a deep understanding of the concept of discrete source entropy and master its calculation.
Systematically analyze the properties of the entropy function and their physical meaning in information theory.
Investigate the key properties of entropy: symmetry, determinacy, expansibility, the extremum property, and upper convexity.
II. Principles and Content
Basic concepts, theory, and calculation methods for discrete sources

The entropy of a discrete source is computed from the statistical characteristics of the source as a whole: it is an average measure that characterizes the overall information content of the source. For a given source, the information entropy is a fixed number; sources with different statistics have different entropy values. For a discrete source X whose symbols x_1, ..., x_q occur with probabilities p(x_1), ..., p(x_q), the entropy is

H(X) = -Σ p(x_i)·log2 p(x_i), summing i from 1 to q   (bit/symbol)
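As a minimal MATLAB sketch of this formula (the helper name entropy_of is illustrative, not part of the required experiment code), the entropy of any probability vector can be computed directly; save it as entropy_of.m:
function H = entropy_of(p)
    p = p(p > 0);               % drop zero-probability symbols, by the convention 0*log2(0) = 0
    H = -sum(p .* log2(p));     % H(X) = -sum of p_i*log2(p_i), in bit/symbol
end
For example, entropy_of([1/2 1/2]) returns 1 and entropy_of([1/4 3/4]) returns about 0.8113.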
1. Basic requirement
Study the entropy of a two-symbol discrete source and plot the entropy curve of the source, with the horizontal and vertical axes labeled with the physical quantities and their units. From the curve, analyze the properties exhibited by the discrete source.
2. Extended requirements
The discrete source outputs a finite number of symbols (possibly more than two, configurable by a parameter); the user can enter the probability of each symbol; the program verifies that the probabilities sum to 1, confirming the input is a valid distribution; it computes and prints the source entropy; and the entropy values are used to examine the symmetry, determinacy, expansibility, and extremum properties. Additional self-designed content may be included; one possible shape for such a script is sketched below.
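As one way to meet all of these requirements in a single script, the following is a minimal vectorized sketch (the prompt text and variable names are my own; it assumes strictly positive probabilities, and the programs actually used in this experiment appear in Section V):
p = input('Enter the symbol probabilities as a vector, e.g. [1/4 1/4 1/2]: ');
if abs(sum(p) - 1) > 1e-6           % tolerance for floating-point round-off
    disp('The probabilities do not sum to 1.');
else
    I = log2(1 ./ p);               % self-information of each symbol (bit)
    H = sum(p .* I);                % source entropy (bit/symbol)
    disp('Self-information of each symbol:'); disp(I);
    fprintf('Source entropy: %.4f bit/symbol\n', H);
end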
III. Equipment and Materials
A computer with the MATLAB software platform installed.
IV. Procedure
First, open the MATLAB editor and enter the commented, well-structured source program. Save the source file under a path consistent with the program's purpose and do not change it arbitrarily. Run the program by selecting "Run" from the "Debug" menu or by pressing F5. After the program runs, analyze the results; if they do not meet the design requirements, debug and revise the program.
V. Programs and Results
Basic-part program:
p0 = 0;                         % minimum probability
pd = 1;                         % maximum probability
N = 100;                        % number of sampling points
p = linspace(p0, pd, N);        % N sampling points forming a linear vector
p = p(2:end-1);                 % drop p = 0 and p = 1, where 0*log2(0) evaluates to NaN in MATLAB
pa = [1/2, 1/2];                % pa is the probability space of a discrete memoryless source
pb = [1/4, 3/4];                % pb is the probability space of a discrete memoryless source
entropya = 0;                   % initial entropy of source a is 0
entropyb = 0;                   % initial entropy of source b is 0
for i = 1:2                     % two messages, so the loop runs twice
    entropya = entropya + pa(i)*log2(1/pa(i));  % accumulate the entropy sum for source a
    entropyb = entropyb + pb(i)*log2(1/pb(i));  % accumulate the entropy sum for source b
end
disp('Entropy of discrete memoryless source a:');
entropya
disp('Entropy of discrete memoryless source b:');
entropyb
H = -p.*log2(p) - (1-p).*log2(1-p);             % binary entropy function
plot(p, H)
title('entropy H(p) = -p*log2(p) - (1-p)*log2(1-p)')
xlabel('p'); ylabel('H(p) (bit/symbol)')
Basic-part results: (console output and figure omitted; the script prints the two entropies and plots the binary entropy curve H(p) against p.)
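These values can be checked by hand: H(1/2, 1/2) = 1/2·log2(2) + 1/2·log2(2) = 1 bit/symbol, and H(1/4, 3/4) = 1/4·log2(4) + 3/4·log2(4/3) ≈ 0.5 + 0.3113 = 0.8113 bit/symbol; the plotted curve peaks at 1 bit/symbol at p = 0.5.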
Extended-part programs:
Program:
pa = input('Enter the probability of each symbol: pa = ');  % e.g. a 3-element vector
q = length(pa);                 % number of symbols, set by the length of the input vector
if abs(sum(pa) - 1) > 1e-6      % tolerance avoids rejecting inputs with floating-point round-off
    disp('The sum of the probabilities is not 1');  % invalid distribution: stop here
else
    disp('The sum of the probabilities is 1');
    entropya = 0;               % initial entropy of the source is 0
    I = [];                     % I holds the self-information of each symbol
    for i = 1:q                 % one loop pass per message
        entropya = entropya + pa(i)*log2(1/pa(i));  % accumulate the entropy sum
        I(i) = log2(1/pa(i));                       % self-information of symbol i (bit)
    end
    disp('Self-information of the discrete memoryless source:');
    I
    disp('Entropy of the discrete memoryless source:');
    entropya
end
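For example, entering [1/4 1/4 1/2] at the prompt (an illustrative input, not necessarily the one from the original run) should report that the probabilities sum to 1, self-information values of 2, 2, and 1 bit, and an entropy of 1/4·2 + 1/4·2 + 1/2·1 = 1.5 bit/symbol.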
Results: (console output omitted.)
Program:
% symmetry
pa = [1/2, 1/4, 1/8, 1/8];      % pa is the probability space of a discrete memoryless source
pb = [1/4, 1/2, 1/8, 1/8];      % pb: the same probabilities as pa in a different order
pc = [1/4, 1/8, 1/2, 1/8];      % pc: another permutation of the same probabilities
entropya = 0;                   % initial entropy of each source is 0
entropyb = 0;
entropyc = 0;
Ia = [];                        % Ia, Ib, Ic hold the self-information of each symbol
Ib = [];
Ic = [];
for i = 1:4                     % four messages, so the loop runs four times
    entropya = entropya + pa(i)*log2(1/pa(i));  % accumulate the entropy sums
    Ia(i) = log2(1/pa(i));                      % self-information of symbol i (bit)
    entropyb = entropyb + pb(i)*log2(1/pb(i));
    Ib(i) = log2(1/pb(i));
    entropyc = entropyc + pc(i)*log2(1/pc(i));
    Ic(i) = log2(1/pc(i));
end
disp('Self-information of source a:');
Ia
disp('Entropy of source a:');
entropya
disp('Self-information of source b:');
Ib
disp('Entropy of source b:');
entropyb
disp('Self-information of source c:');
Ic
disp('Entropy of source c:');
entropyc
% determinacy
pa = [1, 0];                    % one symbol is certain
pb = [1, 0, 0];
pc = [1, 0, 0, 0];
Ha = pa(1)*log2(1/pa(1)) + 0;   % zero-probability terms written as literal 0, by the convention 0*log2(1/0) = 0
disp('Entropy of source a:');
Ha
Hb = pb(1)*log2(1/pb(1)) + 0 + 0;
disp('Entropy of source b:');
Hb
Hc = pc(1)*log2(1/pc(1)) + 0 + 0 + 0;
disp('Entropy of source c:');
Hc
% extremum property, upper convexity
p = 0.00001:0.001:1;                    % sample p over (0, 1), avoiding p = 0 and p = 1 where the logarithms diverge
H = -p.*log2(p) - (1-p).*log2(1-p);     % binary entropy function
plot(p, H)
title('entropy H(p) = -p*log2(p) - (1-p)*log2(1-p)')
xlabel('p'); ylabel('H(p) (bit/symbol)')
Results: (console output and figure omitted.)
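All three orderings give the same entropy by hand: H = 1/2·log2(2) + 1/4·log2(4) + 1/8·log2(8) + 1/8·log2(8) = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 bit/symbol, because the sum Σ p_i·log2(1/p_i) is unchanged by reordering its terms. Each determinacy case likewise reduces to 1·log2(1) = 0 bit/symbol.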
Program:
% expansibility
pa = [1/4, 1/4, 1/8, 1/8, 1/8, 1/8];                % pa is the probability space of a discrete memoryless source
pb = [1/4, 1/4, 1/8, 1/8, 1/8, 511/4096, 1/4096];   % pb splits pa's last symbol (1/8 = 512/4096) into two parts
entropya = 0;                   % initial entropy of each source is 0
entropyb = 0;
Ia = [];                        % Ia, Ib hold the self-information of each symbol
Ib = [];
for i = 1:6                     % six messages, so the loop runs six times
    entropya = entropya + pa(i)*log2(1/pa(i));  % accumulate the entropy sum
    Ia(i) = log2(1/pa(i));                      % self-information of symbol i (bit)
end
disp('Self-information of source a:');
Ia
disp('Entropy of source a:');
entropya
for i = 1:7                     % seven messages, so the loop runs seven times
    entropyb = entropyb + pb(i)*log2(1/pb(i));
    Ib(i) = log2(1/pb(i));
end
disp('Self-information of source b:');
Ib
disp('Entropy of source b:');
entropyb
Results: (console output omitted.)
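By hand, H(pa) = 2·(1/4·2) + 4·(1/8·3) = 2.5 bit/symbol, while H(pb) replaces the last 1/8·3 = 0.375 term with 511/4096·log2(4096/511) + 1/4096·log2(4096) ≈ 0.3746 + 0.0029 ≈ 0.3775, giving H(pb) ≈ 2.5025 bit/symbol. The change is small because the split-off probability ε = 1/4096 contributes only about ε·log2(1/ε), which vanishes as ε → 0; this is the behavior the expansibility property describes.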
VI. Summary
(1) The basic-part program produces a curve whose horizontal axis is the probability p and whose vertical axis is the computed entropy in bit/symbol. Since the object of study is a discrete source that outputs two symbols, one symbol has probability p and the other probability 1-p. The plot title gives the entropy function: H(p) = -p·log2(p) - (1-p)·log2(1-p).
The two examples show that for a discrete source with q symbols, the source entropy reaches its maximum only when the q symbols are equiprobable; this is the extremum property. Since the curve is that of an upper convex (concave) function, it also exhibits upper convexity. The experimental results match the expected behavior.
(2) In the extended part, the first program checks whether the entered probabilities sum to 1; if they do, it computes the self-information and the entropy, and otherwise it stops. The second part computes the self-information and entropy of three probability spaces with the same overall structure (the same probabilities in different orders); the resulting entropies are all equal, verifying symmetry. The third part verifies determinacy, the curve from the fourth part verifies the extremum property and upper convexity, and the run of the fifth part verifies expansibility. The experimental results match the expected behavior.
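As a final numerical check of the extremum property (my own addition, reusing the distributions from the symmetry test): for q = 4 symbols the maximum entropy is log2(4) = 2 bit/symbol, attained by the equiprobable distribution [1/4, 1/4, 1/4, 1/4], whereas the non-uniform distribution [1/2, 1/4, 1/8, 1/8] used above gives only 1.75 bit/symbol.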
