{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "using Downloads"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download a Book from the Internet\n",
    "\n",
    "Download a book and read it into a string in Julia. Then report the number of letters, words and lines (see the tasks below)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "File to download: http://www.gutenberg.org/files/2600/2600-0.txt\n",
      "\n",
      "check the subfolder Results\n",
      "\n"
     ]
    }
   ],
   "source": [
    "if !isdir(\"Results\")\n",
    "    error(\"create the subfolder Results before running this program\")\n",
    "end\n",
    "\n",
    "http = \"http://www.gutenberg.org/files/2600/2600-0.txt\"\n",
    "\n",
    "println(\"File to download: \",http)\n",
    "Downloads.download(http,\"Results/WarAndPeace.txt\")\n",
    "\n",
    "println(\"\\ncheck the subfolder Results\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "str = read(\"Results/WarAndPeace.txt\",String)    #read the whole file into one string"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Task 1\n",
    "\n",
    "1. Count the number of letters and the number of unique letters in `str`. Hint: `length()` and `unique()`\n",
    "\n",
    "2. Count the number of words and lines. Hint: `split(str)` and `split(str,\"\\n\")`\n",
    "\n",
    "3. Count the number of unique words."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "number of letters and unique letters in W&P: 3293555 113\n",
      "\n",
      "unique letters: ['\\ufeff', 'T', 'h', 'e', ' ', 'P', 'r', 'o', 'j', 'c', 't', 'G', 'u', 'n', 'b', 'g', 'B', 'k', 'f', 'W', 'a', 'd', ',', 'y', 'L', 'l', 's', '\\r', '\\n', 'i', 'w', 'U', 'S', 'm', 'p', 'v', '.', 'Y', '-', 'I', ':', 'A', 'M', 'R', 'D', '2', '0', '1', '[', '#', '6', ']', 'J', '9', 'E', 'C', 'F', '8', 'V', '*', 'O', 'H', 'N', 'K', '/', '5', 'X', '7', '3', '“', '’', '—', '‘', '!', '?', '”', 'á', 'é', 'ë', 'í', ';', 'x', '(', ')', 'z', 'q', 'À', 'ó', 'ú', 'è', 'î', 'ô', 'à', 'ç', 'Q', 'â', 'ê', 'ï', 'Z', '4', 'ý', 'ö', 'ä', 'ü', 'Á', 'œ', 'É', '=', 'æ', '\"', '%', '\\'', '$']\n",
      "\n",
      "ASCII letters: ['T', 'h', 'e', ' ', 'P', 'r', 'o', 'j', 'c', 't', 'G', 'u', 'n', 'b', 'g', 'B', 'k', 'f', 'W', 'a', 'd', ',', 'y', 'L', 'l', 's', '\\r', '\\n', 'i', 'w', 'U', 'S', 'm', 'p', 'v', '.', 'Y', '-', 'I', ':', 'A', 'M', 'R', 'D', '2', '0', '1', '[', '#', '6', ']', 'J', '9', 'E', 'C', 'F', '8', 'V', '*', 'O', 'H', 'N', 'K', '/', '5', 'X', '7', '3', '!', '?', ';', 'x', '(', ')', 'z', 'q', 'Q', 'Z', '4', '=', '\"', '%', '\\'', '$']\n"
     ]
    }
   ],
   "source": [
    "n = length(str)\n",
    "letters = unique(str)\n",
    "println(\"number of letters and unique letters in W&P: \", n,\" \",length(letters))\n",
    "\n",
    "println(\"\\nunique letters: \",letters)\n",
    "\n",
    "vv = isascii.(letters)\n",
    "println(\"\\nASCII letters: \",letters[vv])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "number of words in W&P, including pre-amble, etc: 566334\n",
      "number of lines in W&P, including pre-amble, etc: 66033\n"
     ]
    }
   ],
   "source": [
    "words = split(str)\n",
    "println(\"number of words in W&P, including pre-amble, etc: \",length(words))\n",
    "\n",
    "lines = split(str,\"\\n\")\n",
    "println(\"number of lines in W&P, including pre-amble, etc: \",length(lines))"
   ]
  },
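  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The unique letters printed earlier include `'\\r'`, which suggests that the file uses Windows-style line endings, so `split(str,\"\\n\")` leaves a trailing `\\r` on each line. A minimal sketch of one way to drop it, using a small made-up `sample` string rather than the book itself (splitting on a regex is an illustrative choice, not part of the original solution):\n",
    "\n",
    "```julia\n",
    "# `sample` is a made-up stand-in for `str`, with CRLF line endings\n",
    "sample = \"first line\\r\\nsecond line\\r\\n\"\n",
    "\n",
    "raw_lines   = split(sample, \"\\n\")                        # keeps the '\\r' and a trailing empty piece\n",
    "clean_lines = split(sample, r\"\\r?\\n\"; keepempty=false)   # drops '\\r' and empty pieces\n",
    "\n",
    "println(length(raw_lines), \" \", length(clean_lines))\n",
    "```\n",
    "\n",
    "Note that `keepempty=false` also drops blank lines inside the text, so the two line counts are not directly comparable."
   ]
  },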
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "How many unique words are there in the file?\n",
      "Number of unique words: 41971\n"
     ]
    }
   ],
   "source": [
    "println(\"How many unique words are there in the file?\")\n",
    "UniqueWords = unique(words)\n",
    "\n",
    "println(\"Number of unique words: \",length(UniqueWords))"
   ]
  },
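  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`split(str)` separates on whitespace only, so tokens such as `Borodinó` and `Borodinó.` count as different words, as do `The` and `the`. A rough sketch of a stricter count is given here, assuming that stripping non-letters from both ends and lowercasing is an acceptable normalization. `clean_token` is a hypothetical helper and `sample` is a made-up string, neither is part of the original solution:\n",
    "\n",
    "```julia\n",
    "# hypothetical helper: strip non-letters from both ends of a token, then lowercase it\n",
    "clean_token(w) = lowercase(strip(!isletter, w))\n",
    "\n",
    "sample = \"The battle of Borodinó. “Borodinó,” the other corrected him.\"\n",
    "tokens = split(sample)\n",
    "\n",
    "raw_unique   = unique(tokens)                                    # punctuation and case kept\n",
    "clean_unique = unique(filter(!isempty, clean_token.(tokens)))    # normalized tokens\n",
    "\n",
    "println(length(raw_unique), \" \", length(clean_unique))\n",
    "```\n",
    "\n",
    "Applying `clean_token.(words)` to the full text would probably give a smaller number than the count reported by the preceding cell, since punctuation and capitalization no longer create distinct entries."
   ]
  },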
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Task 2\n",
    "\n",
    "1. How often is Borodinó mentioned? Hint: `occursin.(,words)`\n",
    "\n",
    "2. Print all lines that contain the word Borodinó. Hint: `occursin(,line)`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "How often is Borodinó mentioned?\n",
      "108\n",
      "108\n"
     ]
    }
   ],
   "source": [
    "println(\"How often is Borodinó mentioned?\")\n",
    "\n",
    "println(sum(occursin.(\"Borodinó\",words)))       #also matches tokens such as \"Borodinó.\" or \"Borodinó,\"\n",
    "\n",
    "println(sum(z->occursin(\"Borodinó\",z),words))   #same count, but avoids allocating a temporary array"
   ]
  },
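  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both approaches count the *tokens* that contain `Borodinó`, so a token that contains the name twice is counted once. A sketch of counting substring matches directly in the text is given here, assuming Julia 1.3 or later for the `count(pattern, string)` method. The `sample` string is made up for illustration:\n",
    "\n",
    "```julia\n",
    "sample = \"Borodinó-Borodinó was fought at Borodinó.\"\n",
    "\n",
    "n_tokens  = sum(z -> occursin(\"Borodinó\", z), split(sample))   # tokens containing the name\n",
    "n_matches = count(\"Borodinó\", sample)                          # non-overlapping substring matches\n",
    "\n",
    "println(n_tokens, \" \", n_matches)\n",
    "```\n",
    "\n",
    "On the book itself, `count(\"Borodinó\", str)` would differ from the token count only if some token contains the name more than once."
   ]
  },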
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "print the line numbers and the lines that contain the word Borodinó:\n",
      "\n",
      "38971 they reached Borodinó, seventy miles from Moscow. From Vyázma Napoleon\n",
      "41375 the twenty-sixth the battle of Borodinó itself took place.\n",
      "41377 Why and how were the battles of Shevárdino and Borodinó given and\n",
      "41378 accepted? Why was the battle of Borodinó fought? There was not the least\n",
      "41399 Before the battle of Borodinó our strength in proportion to the French\n",
      "41416 In giving and accepting battle at Borodinó, Kutúzov acted involuntarily\n",
      "41427 On the other question, how the battle of Borodinó and the preceding\n",
      "41434 position at Borodinó.\n",
      "41438 to it, from Borodinó to Utítsa, at the very place where the battle was\n",
      "41445 the field of Borodinó.\n",
      "41451 during the retreat passed many positions better than Borodinó. They did\n",
      "41457 and that the position at Borodinó (the one where the battle was fought),\n",
      "41463 Borodinó to the left of, and at a right angle to, the highroad (that\n",
      "41481 later, when reports on the battle of Borodinó were written at leisure,\n",
      "41486 the battle of Borodinó was fought by us on an entrenched position\n",
      "41493 village of Nóvoe, and the center at Borodinó at the confluence of the\n",
      "41496 To anyone who looks at the field of Borodinó without thinking of how\n",
      "41503 to Borodinó (he could not have seen that position because it did not\n",
      "41514 Borodinó—a plain no more advantageous as a position than any other plain\n",
      "41531 and chief action of the battle of Borodinó was already lost on the\n",
      "41551 distinct from the main course of the battle.) So the battle of Borodinó\n",
      "41554 army and people) it has been described. The battle of Borodinó was not\n",
      "41557 Shevárdino Redoubt, the Russians fought the battle of Borodinó on an\n",
      "41749 hundred paces in front of the knoll and below it. This was Borodinó.\n",
      "41780 “Borodinó,” the other corrected him.\n",
      "41814 entrenchments. There, you see? There’s our center, at Borodinó, just\n",
      "41855 A church procession was coming up the hill from Borodinó. First along\n",
      "42128 Borodinó and thence turned to the left, passing an enormous number of\n",
      "42133 him than any other spot on the plain of Borodinó.\n",
      "42643 On August 25, the eve of the battle of Borodinó, M. de Beausset, prefect\n",
      "42949 The fourth order was: The vice-King will occupy the village (Borodinó)\n",
      "42957 given him, he was to advance from the left through Borodinó to the\n",
      "42962 not be executed. After passing through Borodinó the vice-King was driven\n",
      "42982 Many historians say that the French did not win the battle of Borodinó\n",
      "42993 battle of Borodinó, and if this or that other arrangement depended on\n",
      "43015 men at Borodinó was not due to Napoleon’s will, though he ordered the\n",
      "43022 At the battle of Borodinó Napoleon shot at no one and killed no one.\n",
      "43026 The French soldiers went to kill and be killed at the battle of Borodinó\n",
      "43065 than previous ones because the battle of Borodinó was the first Napoleon\n",
      "43078 Napoleon at the battle of Borodinó fulfilled his office as\n",
      "43286 Pierre most of all was the view of the battlefield itself, of Borodinó\n",
      "43289 Above the Kolochá, in Borodinó and on both sides of it, especially to\n",
      "43297 riverbanks and in Borodinó. A white church could be seen through the\n",
      "43298 mist, and here and there the roofs of huts in Borodinó as well as dense\n",
      "43301 the whole space. Just as in the mist-enveloped hollow near Borodinó, so\n",
      "43386 bridge across the Kolochá between Górki and Borodinó, which the French\n",
      "43387 (having occupied Borodinó) were attacking in the first phase of the\n",
      "43819 The chief action of the battle of Borodinó was fought within the seven\n",
      "43820 thousand feet between Borodinó and Bagratión’s flèches. Beyond that\n",
      "43825 battlefield. On the field between Borodinó and the flèches, beside the\n",
      "43834 troops advanced on Borodinó from their left.\n",
      "43838 to Borodinó, so that Napoleon could not see what was happening there,\n",
      "43886 Borodinó had been occupied and the bridge over the Kolochá was in the\n",
      "43890 as soon in fact as the adjutant had left Borodinó—the bridge had been\n",
      "44222 times repulsed. In the center the French had not got beyond Borodinó,\n",
      "44301 of the field of Borodinó.\n",
      "44821 hundreds of years the peasants of Borodinó, Górki, Shevárdino, and\n",
      "44903 Russians at Borodinó. The French invaders, like an infuriated animal\n",
      "44909 wound it had received at Borodinó. The direct consequence of the battle\n",
      "44910 of Borodinó was Napoleon’s senseless flight from Moscow, his retreat\n",
      "44913 which at Borodinó for the first time the hand of an opponent of stronger\n",
      "45069 from Smolénsk to Borodinó. The French army pushed on to Moscow, its\n",
      "45078 consolidated. At Borodinó a collision took place. Neither army was\n",
      "45097 Russian army were convinced that the battle of Borodinó was a victory.\n",
      "45185 the twenty-sixth at Borodinó, and each day and hour and minute of the\n",
      "45186 retreat from Borodinó to Filí.\n",
      "45449 After the battle of Borodinó the abandonment and burning of Moscow was\n",
      "45491 that could happen. They went away even before the battle of Borodinó and\n",
      "45873 Borodinó.\n",
      "45881 Toward the end of the battle of Borodinó, Pierre, having run down\n",
      "46434 and commotion. Every day thousands of men wounded at Borodinó were\n",
      "46442 Some said there had been another battle after Borodinó at which the\n",
      "47598 to the second of September, that is from the battle of Borodinó to the\n",
      "48315 ever since the battle of Borodinó, for all the generals who came to\n",
      "48353 if after the battle of Borodinó, when the surrender of Moscow became\n",
      "49092 particularly of the battle of Borodinó and of that vague sense of his\n",
      "50087 ambulance station on the field of Borodinó. His feverish state and the\n",
      "50105 Borodinó. They were accompanied by a doctor, Prince Andrew’s valet, his\n",
      "50835 battle of Borodinó, there was a soiree, the chief feature of which was\n",
      "51280 A few days before the battle of Borodinó, Nicholas received the\n",
      "51740 The dreadful news of the battle of Borodinó, of our losses in killed and\n",
      "51747 When he received the news of the battle of Borodinó and the abandonment\n",
      "53596 The historians consider that, next to the battle of Borodinó and the\n",
      "53703 whole campaign and by the battle of Borodinó, the Russian army—when\n",
      "53711 Borodinó had been a victory, he alone—who as commander in chief might\n",
      "53715 The beast wounded at Borodinó was lying where the fleeing hunter had\n",
      "54244 or deliberately deceive themselves. No battle—Tarútino, Borodinó, or\n",
      "54858 of Borodinó. He had sought it in philanthropy, in Freemasonry, in the\n",
      "55314 day long. At the battle of Borodinó, when Bagratión was killed and nine\n",
      "55319 And the quiet little Dokhtúrov rode thither, and Borodinó became the\n",
      "55543 The undecided question as to whether the wound inflicted at Borodinó was\n",
      "55658 That army could not recover anywhere. Since the battle of Borodinó\n",
      "55805 The Battle of Borodinó, with the occupation of Moscow that followed it\n",
      "55840 history: to say that the field of battle at Borodinó remained in the\n",
      "55844 After the French victory at Borodinó there was no general engagement nor\n",
      "55855 The period of the campaign of 1812 from the battle of Borodinó to the\n",
      "55895 retreats after battles, the blow dealt at Borodinó and the renewed\n",
      "57692 done at Mozháysk after the battle of Borodinó.\n",
      "58038 French had given battle at Borodinó, did not achieve its purpose when it\n",
      "58063 enemy in full strength at Borodinó—defeated at Krásnoe and the Berëzina\n",
      "58772 activity in 1812, never once swerving by word or deed from Borodinó to\n",
      "58819 Beginning with the battle of Borodinó, from which time his disagreement\n",
      "58820 with those about him began, he alone said that the battle of Borodinó\n",
      "58845 this enemy of decisive action, gave battle at Borodinó, investing the\n",
      "58848 contradiction to everyone else, declared till his death that Borodinó\n",
      "59762 Borodinó for more than a month had recently died in the Rostóvs’ house\n",
      "60187 one at Borodinó.\n",
      "61413 the cold in his head at Borodinó to the sparks which set Moscow on\n"
     ]
    }
   ],
   "source": [
    "println(\"print the line numbers and the lines that contain the word Borodinó:\\n\")\n",
    "for (i,line) in enumerate(lines)\n",
    "    if occursin(\"Borodinó\",line)\n",
    "        println(i,\" \",line)\n",
    "    end\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Task 3\n",
    "\n",
    "1. Change Borodinó everywhere to Berëzina and then count the occurrences."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Change Borodinó everywhere to Berëzina and then count the occurrences\n",
      "125\n",
      "125\n"
     ]
    }
   ],
   "source": [
    "println(\"Change Borodinó everywhere to Berëzina and then count the occurrences\")\n",
    "str2 = replace(str,\"Borodinó\"=>\"Berëzina\");\n",
    "words2 = split(str2)\n",
    "println(sum(occursin.(\"Berëzina\",words2)))\n",
    "\n",
    "println(sum(z->occursin(\"Berëzina\",z),words2))   #same count, but avoids allocating a temporary array"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "@webio": {
   "lastCommId": null,
   "lastKernelId": null
  },
  "anaconda-cloud": {},
  "kernel_info": {
   "name": "julia-1.2"
  },
  "kernelspec": {
   "display_name": "Julia 1.7.0",
   "language": "julia",
   "name": "julia-1.7"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.7.0"
  },
  "nteract": {
   "version": "0.24.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}