Text Analysis Assignment

Calculate the entropy of the Universal Declaration of Human Rights (UDHR), assuming all letters are distributed independently.
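
As a reminder of the quantity being computed, and assuming "entropy" here means Shannon entropy measured in bits per character (the base-2 logarithm is an assumption, not stated in the assignment), the calculation can be written as follows. The symbols n_i (count of character i), N (total number of characters) and p_i (probability of character i) are introduced here for illustration only.

```latex
p_i = \frac{n_i}{N}, \qquad N = \sum_i n_i, \qquad
H = -\sum_i p_i \log_2 p_i \quad \text{(bits per character)}
```

For example, a text in which two characters each occur half of the time has H = 1 bit per character.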

To this end, you can use the following facilities: Tom's JavaScript Machine and the programs count_text.js, freq2prob.js, and entropy.js.

Instructions:

  1. Open Tom's JavaScript Machine.
  2. Enable Advanced mode.
  3. Copy the UDHR into the (yellow) input box.
  4. Copy the program count_text.js into the (green) program box.
  5. Click Run to get the letter frequencies in the (blue) output box.
     N.B. Spaces are counted as well, but adjacent whitespace is merged.
  6. Click Output: Copy to Input.
  7. Copy the program freq2prob.js into the program box.
  8. Click Run to get the letter probabilities in the output box.
  9. Click Output: Copy to Input.
  10. Copy entropy.js into the program box.
  11. Click Run to get the entropy in the output box.
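
The three stages in the steps above can be sketched in plain JavaScript as shown below. This is only an illustration: the functions countText, freqToProb and entropy are hypothetical stand-ins, not the actual contents of count_text.js, freq2prob.js and entropy.js, which may differ (for example in how they treat upper and lower case, punctuation, or their input and output formats). The sketch assumes every character is counted as-is, that runs of whitespace are merged into a single space, and that the entropy is measured in bits.

```javascript
// Hypothetical stand-ins for the three programs; the real ones may differ.

// Stage 1: count each character, after merging adjacent whitespace
// into a single space (so spaces are counted, but only once per run).
function countText(text) {
  const counts = new Map();
  const normalized = text.replace(/\s+/g, " ");
  for (const ch of normalized) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  return counts;
}

// Stage 2: convert counts to probabilities by dividing by the total count.
function freqToProb(counts) {
  let total = 0;
  for (const n of counts.values()) total += n;
  const probs = new Map();
  for (const [ch, n] of counts) probs.set(ch, n / total);
  return probs;
}

// Stage 3: Shannon entropy in bits per character (assumed base-2 logarithm).
function entropy(probs) {
  let h = 0;
  for (const p of probs.values()) {
    if (p > 0) h -= p * Math.log2(p);
  }
  return h;
}

// Usage: chain the stages the same way the output is copied to the input.
const exampleEntropy = entropy(freqToProb(countText("abab"))); // 1 bit
```

Keeping the stages separate mirrors the Copy to Input workflow: each function consumes exactly what the previous one produced.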

If you are up to it, merge the three programs so that a single run yields the frequencies, probabilities, and entropy.
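
One possible shape for such a merged program is sketched below. The function name textEntropy and its return format are assumptions made for illustration, and the sketch again assumes case-sensitive counting of every character, merged whitespace, and entropy in bits; it is not derived from the actual three programs.

```javascript
// A hypothetical merged version: frequencies, probabilities and entropy
// are computed in one call and returned together.
function textEntropy(text) {
  // Merge adjacent whitespace into a single space before counting.
  const normalized = text.replace(/\s+/g, " ");

  // Frequencies: count every character and keep a running total.
  const counts = new Map();
  let total = 0;
  for (const ch of normalized) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
    total += 1;
  }

  // Probabilities and entropy: one pass over the distinct characters.
  const probabilities = new Map();
  let bits = 0;
  for (const [ch, n] of counts) {
    const p = n / total;
    probabilities.set(ch, p);
    bits -= p * Math.log2(p); // p > 0 here, since every counted n >= 1
  }

  return { counts, probabilities, entropy: bits };
}

// Usage: print the entropy of a short sample text.
const result = textEntropy("to be or not to be");
console.log(result.entropy.toFixed(3) + " bits per character");
```

Returning all three results from one function keeps the intermediate tables available for inspection, which the separate runs provided through the output box.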

Also do this for some other texts.


©2014, Tom Verhoeff (TUE)