
Emission and carbon policy analysis
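
As a minimal sketch of the count-based LDA quantities discussed in this section (Equations II and III), the snippet below estimates the topic-word probabilities from word-to-topic assignment counts and then combines them with a document-topic distribution. The toy vocabulary, the topic labels `t1` and `t2`, the assignment lists, and the values of `theta_d` are illustrative assumptions, not data from the study. The function names are hypothetical as well.

```python
from collections import Counter


def topic_word_probs(assigned_words, vocabulary):
    """Equation (II): p_wi = N_wi / N for a single topic t.

    assigned_words: every word occurrence currently assigned to topic t.
    vocabulary:     the VOC; words never assigned to t get probability 0.
    """
    counts = Counter(assigned_words)
    total = len(assigned_words)
    if total == 0:
        # No words assigned to this topic yet: return all zeros.
        return {w: 0.0 for w in vocabulary}
    return {w: counts[w] / total for w in vocabulary}


def word_given_doc(word, phi, theta_d):
    """Equation (III) per topic, p(w|t) * p(t|d), plus the sum over topics.

    The sum reflects "considering all possible topics" in the text; it is
    an interpretation of that sentence, not a formula given in the paper.
    """
    per_topic = {t: phi[t][word] * theta_d[t] for t in theta_d}
    return per_topic, sum(per_topic.values())


# Toy data (assumed for illustration only).
vocabulary = ["carbon", "emission", "energy", "subsidy"]
assignments = {
    "t1": ["carbon", "emission", "carbon", "energy"],
    "t2": ["subsidy", "energy", "subsidy", "subsidy", "carbon"],
}
phi = {t: topic_word_probs(ws, vocabulary) for t, ws in assignments.items()}
theta_d = {"t1": 0.75, "t2": 0.25}  # assumed p(t|d) for one document d

per_topic, p_carbon = word_given_doc("carbon", phi, theta_d)
print(per_topic)
print(p_carbon)
```

In an iterative sampler, the per-topic products would be normalized and used to reassign the word's topic, after which the counts behind `phi` and `theta_d` are updated. That loop is omitted here to keep the sketch short.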

For each topic t in the set T of topics, the probability of generating different words is represented by the vector $\varphi_t = \langle p_{w_1}, \ldots, p_{w_m} \rangle$, where $p_{w_i}$ denotes the likelihood of topic t generating the i-th word in the VOC. This probability is calculated as:

$$p_{w_i} = \frac{N_{w_i}}{N} \qquad \text{(II)}$$

Here, $N_{w_i}$ denotes the number of times the i-th word in the VOC is assigned to topic t, and $N$ represents the total number of words associated with topic t.

The core formula of LDA is given by:

$$p(w \mid d) = p(w \mid t) \times p(t \mid d) \qquad \text{(III)}$$

Using the topic as an intermediate layer, the current values of $\theta_d$ and $\varphi_t$ enable the calculation of the probability of occurrence of a word w in document d, where $p(t \mid d)$ is computed using $\theta_d$, and $p(w \mid t)$ is derived from $\varphi_t$. In addition, with the current values of $\theta_d$ and $\varphi_t$, it is possible to calculate $p(w \mid d)$ by considering all possible topics that a word in a document might belong to. Based on this result, the topic assignment for the word is updated. If the topic assignment changes, it will, in turn, affect both $\theta_d$ and $\varphi_t$. LDA initializes $\theta_d$ and $\varphi_t$ with random values and iteratively updates them until the model converges.²⁸

3.2. SBC

a.  Principle of Sankey-bubble diagram
    The Sankey diagram, also known as the Sankey energy flow diagram or Sankey energy balance diagram, was first introduced in 1898 by Matthew Henry Phineas Riall Sankey. He created a now-famous diagram called the “energy efficiency of the steam engine,” and since then, this type of visualization has been known as the Sankey diagram. Sankey-bubble diagrams combine the Sankey diagram with the features of a bubble chart. Bubble charts can be used to show correlations between three measures or to present three-dimensional data simultaneously.
    The SBC integrates a Sankey diagram with a bubble chart to more effectively visualize complex, multi-dimensional data. These two charts are linked through the names of pathways. The Sankey diagram displays the genes associated with each pathway, while the bubble chart represents multiple attributes: the position of each bubble corresponds to the GeneRatio, the size of the bubble indicates the number of genes enriched in the pathway, and the color represents the pathway’s p-value.²⁹

b.  SBC process
    The SBC consists of a Sankey chart and a bubble chart, which together can more effectively and comprehensively display the multi-dimensional information within the data. The Sankey diagram shows the relationships between different pathways and genes, with each pathway representing the gene distribution within that pathway. In the bubble chart, the position of each bubble indicates the GeneRatio, the size represents the number of genes, and the color reflects the p-value. The Sankey diagram and bubble chart are connected by multiple signaling pathways. In the Sankey diagram, different or identical genes contained within each pathway are shown, while in the bubble chart, the position of each bubble indicates the negative logarithm of the p-value or Q-value of a pathway, the bubble size corresponds to the number of genes enriched in the pathway, and the color reflects the pathway’s Hit Ratio, among other attributes.³⁰

4. Empirical analyses

4.1. Data acquisition and processing

The study employed custom-built web crawlers in a Python 3.7 (Python Software Foundation, USA) environment to collect emission and carbon reduction policy documents from the websites of the National Development and Reform Commission, the Ministry of Agriculture and Rural Affairs, the National Bureau of Statistics, the National Energy Administration, the Ministry of Science and Technology, and the Ministry of Ecology and Environment, covering the period from 2016 to 2023. These documents served as samples for analyzing relevant policies and regulations.

Considering the diverse and interconnected functions of Chinese government departments, this study collected carbon emission reduction data from 2016 to 2023 via official government websites, enabling a comprehensive analysis of emission reduction efforts at the departmental level.

Figure 1 illustrates the changing pattern in the number of policy documents issued between 2016 and 2023. Although issuance fluctuates across departments, the overall trajectory suggests an upward trend, with 70 documents released in 2023, reflecting China’s increasing focus on emission and carbon reduction as a key aspect of national development. In terms of the total number of documents, the peak occurred in 2021, with 115 documents published. During that year, the administrative arm of the State Council, the



                Volume 22 Issue 5 (2025)                       157                           doi: 10.36922/AJWEP025160117