Categories We Write About

Build a script to reformat meeting transcripts

Here’s a Python script that can reformat meeting transcripts by:

  • Putting each speaker label on its own line

  • Keeping timestamps if present (the basic pattern below does not parse them, so this needs an adjusted regex)

  • Cleaning up spacing and line breaks

  • Organizing the text for easier reading

This script assumes the input transcript is a plain text file in which each speaker turn starts on its own line as SpeakerName: speech. You can customize the pattern to match your transcript style.
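To make the idea of customizing patterns concrete, here is a minimal sketch of speaker patterns for three possible transcript styles. The style names, the SPEAKER_PATTERNS table, and the split_speaker helper are hypothetical and chosen for illustration only; your transcripts may need something different.

```python
import re

# Hypothetical patterns for three possible transcript styles (assumptions,
# not taken from any particular tool). Each one captures (speaker, speech).
SPEAKER_PATTERNS = {
    "colon": re.compile(r"^([^:\n]+):[ \t]*(.*)$"),         # Alice: Hello
    "bracket": re.compile(r"^\[([^\]\n]+)\][ \t]*(.*)$"),   # [Alice] Hello
    "dash": re.compile(r"^([^\n]+?)[ \t]+-[ \t]+(.*)$"),    # Alice - Hello
}


def split_speaker(line, style="colon"):
    """Return (speaker, speech) for one line, or None if it does not match."""
    match = SPEAKER_PATTERNS[style].match(line)
    if match is None:
        return None
    speaker, speech = match.groups()
    return speaker.strip(), speech.strip()
```

Keeping the patterns in one table means a different transcript style only requires a new entry, not a change to the parsing logic.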

```python
import re


def reformat_transcript(input_text):
    # Example pattern to identify speaker lines: "SpeakerName: speech"
    # Adjust the regex if your transcript format differs
    speaker_pattern = re.compile(r'^(.*?):\s*(.*)', re.MULTILINE)

    # Find all (speaker, speech) matches
    matches = speaker_pattern.findall(input_text)

    reformatted_lines = []
    for speaker, speech in matches:
        # Collapse extra spaces and line breaks in the speech
        speech = ' '.join(speech.split())
        # Speaker on one line, indented speech on the next
        reformatted_lines.append(f"{speaker.strip()}:\n    {speech}\n")

    return '\n'.join(reformatted_lines)


def main():
    input_file = 'transcript.txt'
    output_file = 'reformatted_transcript.txt'

    with open(input_file, 'r', encoding='utf-8') as f:
        transcript = f.read()

    reformatted = reformat_transcript(transcript)

    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(reformatted)

    print(f"Reformatted transcript saved to {output_file}")


if __name__ == '__main__':
    main()
```

How this works:

  • It scans the transcript for lines of the form SpeakerName: speech. Text that does not match the pattern is left out of the output.

  • Cleans and condenses the speech text.

  • Formats each speaker’s line with the speaker on one line and indented speech below.

  • Saves the output to a new file.
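One caveat of matching speaker lines one at a time is that speech wrapping onto following lines can be lost. The sketch below shows one way to fold such lines into the previous speaker's turn. It assumes that a speaker label is a short prefix of at most 40 characters with no colon in it; SPEAKER_LINE and merge_turns are hypothetical names, and a continuation line that itself contains a colon near its start would be misread as a new speaker.

```python
import re

# Assumption: a speaker label is a short, colon-free prefix followed by ":".
SPEAKER_LINE = re.compile(r"^([^:\n]{1,40}):[ \t]*(.*)$")


def merge_turns(input_text):
    """Group lines into (speaker, speech) turns, folding continuation lines
    into the most recent speaker's speech."""
    turns = []
    for line in input_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = SPEAKER_LINE.match(line)
        if match:
            # A new speaker turn starts here
            turns.append([match.group(1).strip(), match.group(2).strip()])
        elif turns:
            # No speaker label: treat as a continuation of the previous turn
            turns[-1][1] = f"{turns[-1][1]} {line}".strip()
        # Unlabelled lines before the first speaker are dropped
    return [(speaker, speech) for speaker, speech in turns]
```

Each returned pair can then be written out with the speaker on one line and the indented speech below it.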

If your transcript has timestamps or uses a different format, adjust the regular expression so that it matches a sample of your own lines.
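For transcripts with timestamps, one possible adjustment is sketched here. It assumes each line optionally starts with a timestamp such as [00:01:23], 00:01:23, or 00:15, followed by SpeakerName: speech. The TIMESTAMPED_LINE pattern and the format_line helper are hypothetical, and real transcripts may use other timestamp layouts.

```python
import re

# Assumed line format (hypothetical): an optional "[HH:MM:SS]", "HH:MM:SS"
# or "MM:SS" timestamp, then "Speaker: speech".
TIMESTAMPED_LINE = re.compile(
    r"^(?:\[?(?P<time>\d{1,2}:\d{2}(?::\d{2})?)\]?[ \t]+)?"
    r"(?P<speaker>[^:\n]+):[ \t]*(?P<speech>.*)$"
)


def format_line(line):
    """Reformat one transcript line, keeping its timestamp if it has one.

    Returns None when the line does not look like a speaker line.
    """
    match = TIMESTAMPED_LINE.match(line.strip())
    if match is None:
        return None
    speaker = match.group("speaker").strip()
    # Collapse runs of whitespace inside the speech
    speech = " ".join(match.group("speech").split())
    time = match.group("time")
    header = f"[{time}] {speaker}:" if time else f"{speaker}:"
    return f"{header}\n    {speech}"
```

The timestamp is written back in square brackets regardless of how it appeared in the input, so the output stays uniform.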
