bi3mw > 26-08-2024, 08:34 PM
(25-08-2024, 06:11 PM)RobGea Wrote: Original chars -> replaced with
ch -> C
sh -> S
cth -> T
ckh -> K
cph -> P
cfh -> F
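The replacement table above could be applied with something like the following. This is only a minimal sketch: the function name `replace_glyph_groups` and the single-pass regex approach are my own assumptions, not necessarily how the quoted counts were produced. Longer groups are tried first as a precaution.

```python
import re

# Mapping copied from the quoted table; everything else here is illustrative.
REPLACEMENTS = {
    "cth": "T",
    "ckh": "K",
    "cph": "P",
    "cfh": "F",
    "ch": "C",
    "sh": "S",
}

# Build one alternation, longest keys first, so each group is matched in a single pass.
_PATTERN = re.compile(
    "|".join(sorted((re.escape(k) for k in REPLACEMENTS), key=len, reverse=True))
)


def replace_glyph_groups(text):
    """Replace each multi-character group with its single-letter stand-in."""
    return _PATTERN.sub(lambda m: REPLACEMENTS[m.group(0)], text)


print(replace_glyph_groups("chedy shol ckhey"))
```

Because the stand-ins are uppercase, any later step that lowercases the text would fold them back into ordinary `c`, `s`, `t`, and so on.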
RobGea > 26-08-2024, 10:45 PM
bi3mw > 27-08-2024, 03:32 PM
RobGea > 27-08-2024, 09:41 PM
ReneZ > 27-08-2024, 10:57 PM
bi3mw > 28-08-2024, 01:28 PM
(27-08-2024, 10:57 PM)ReneZ Wrote: You should be able to get identical results. That is not an unachievable ideal. It should be possible.
#!/bin/bash
# Check if the correct number of arguments has been passed
if [ "$#" -ne 2 ]; then
    echo "Usage: $0 <inputfile> <outputfile>"
    exit 1
fi
# Input file and output file from the arguments
inputfile="$1"
outputfile="$2"
# Replace every @...; sequence with "w", remove empty lines, and write to the output file
sed -e 's/@[^;]*;/w/g' \
    -e '/^$/d' "$inputfile" > "$outputfile"
# Print success message
echo "The file has been successfully written to $outputfile."
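For anyone who would rather stay in Python, the sed step above could be sketched roughly as follows. The function name `preprocess_lines` is hypothetical. The sketch assumes the same rule as the sed call: every `@...;` sequence becomes `w`, and lines that are empty after substitution are dropped.

```python
import re


def preprocess_lines(lines):
    """Replace each @...; sequence with 'w' and drop lines that end up empty."""
    result = []
    for line in lines:
        # Strip the trailing newline so an empty line compares equal to ''.
        line = re.sub(r'@[^;]*;', 'w', line.rstrip('\n'))
        if line != '':
            result.append(line)
    return result


sample = ["qo@152;dy\n", "\n", "daiin\n"]  # made-up sample lines
print(preprocess_lines(sample))
```

As with the sed version, lines containing only whitespace are kept, since only truly empty lines match `^$`.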
import sys
from collections import Counter
import re
# Function to calculate digram (bigram) frequency within words
def calculate_digram_frequency(filepath):
    # Read file content
    with open(filepath, 'r', encoding='utf-8') as file:
        text = file.read().lower()
    # Replace every character other than a-z and whitespace with a space
    text = re.sub(r'[^a-z\s]', ' ', text)
    # Split text into words based on whitespace
    words = text.split()
    # List to store bigrams
    bigrams = []
    # Generate bigrams within each word
    for word in words:
        if len(word) >= 2:
            bigrams.extend(word[i:i+2] for i in range(len(word) - 1))
    # Count the occurrences of each bigram
    counter = Counter(bigrams)
    return counter, len(bigrams)
# Main program
if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python program.py <input_filepath> <output_filepath>")
        sys.exit(1)
    input_filepath = sys.argv[1]  # Input file as an argument
    output_filepath = sys.argv[2]  # Output file as an argument
    # Calculate frequencies
    counter, total_bigram_count = calculate_digram_frequency(input_filepath)
    # Sort bigrams by frequency and select the top 12
    top_bigrams = counter.most_common(12)
    # Write the top 12 bigrams to the output file
    with open(output_filepath, 'w', encoding='utf-8') as output_file:
        output_file.write(" No. Bigram Frequency (in %) Frequency\n")
        for i, (bigram, frequency) in enumerate(top_bigrams, 1):
            bigram_upper = bigram.upper()  # Convert bigram to uppercase
            frequency_percentage = (frequency / total_bigram_count) * 100 if total_bigram_count > 0 else 0.0
            output_file.write(f"{i:>4} {bigram_upper:>8} {frequency_percentage:>13.4f} {frequency:>10}\n")
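To see why the preprocessing matters for the counts, here is a small check that applies the same cleaning rule to strings instead of files. The helper name `bigrams_in_text` and the sample strings are made up for illustration. With this rule, an unreplaced `@nnn;` code is turned into spaces and splits the word, while replacing it with `w` keeps the word whole and adds bigrams.

```python
import re
from collections import Counter


def bigrams_in_text(text):
    """Count within-word bigrams using the script's cleaning rule."""
    cleaned = re.sub(r'[^a-z\s]', ' ', text.lower())
    counts = Counter()
    for word in cleaned.split():
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    return counts


# Unreplaced code: '@', digits and ';' become spaces, so the word is split in two.
print(sorted(bigrams_in_text("ab@123;cd").items()))
# [('ab', 1), ('cd', 1)]

# Code replaced with 'w' beforehand: the word stays whole.
print(sorted(bigrams_in_text("abwcd").items()))
# [('ab', 1), ('bw', 1), ('cd', 1), ('wc', 1)]
```

Note also that the rule lowercases everything first, so uppercase stand-ins such as `S` are counted together with `s`.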
nablator > 28-08-2024, 02:17 PM
bi3mw > 28-08-2024, 03:03 PM
RobGea > 28-08-2024, 06:04 PM
bi3mw > 28-08-2024, 06:25 PM
(28-08-2024, 06:04 PM)RobGea Wrote: The wrong Cryptool counts are totally my bad. I just grabbed a file that had @nnn as high-ASCII bytes, and Cryptool has a default charset of a-z, which I forgot to account for. Totally my fault, sorry for the confusion!