Episode Transcripts

Jonah

I would like to follow up on DolorousStroke's suggestion about transcribing episodes. It seems that was suggested on both the "Thought on Organization" thread and the "Welcome, Dwarven Smiths" thread. I thought we might move that conversation to its own thread.
 
It was pointed out in the "Welcome, Dwarven Smiths" thread that YouTube transcribes the episodes (though not 1-15). I checked that out and here is a sample from Episode 16:

05:10
alright so that's tomorrow night and
05:13
tonight we're digging back into chapter
05:16
4 so like 8 years ago when we last had
05:20
class we left the hub least you have to
05:25
acknowledge that we we left our
05:28
protagonists in a pretty good spot right
05:31
we left Frodo and Sam and Pippin getting
05:35
rather tipsy on elf liquor sitting under
05:39
a tree on the edge of the barracks
05:42
so they've been having a jolly time for
05:44
the last three weeks while we've been
05:47
going about our business and we're going
05:48
to pick up with them tonight what I want
05:51
to be really looking at is I want to be
05:52
thinking about the way in which there
05:56
are a lot of juxtapositions I think in
05:57
Chapter four you know what
05:59
juxtaposition is a really fun word which
06:02
I enjoy saying whenever I can that is
06:04
you know when when when things are kind
06:06
of put next to each other in a way that
06:09
makes us really think about both of them
06:10
differently the contrast between the

My initial attempt at turning that one minute of transcript into legible paragraphs:

(05:10) Alright, so that's tomorrow night, and (05:13) tonight we're digging back into Chapter (05:16) 4. So, like 8 years ago when we last had (05:20) class, we left the hobbits – at least you have to (05:25) acknowledge that we left our (05:28) protagonists in a pretty good spot, right? (05:31) We left Frodo and Sam and Pippin getting (05:35) rather tipsy on elf liquor, sitting under (05:39) a tree on the edge of the marish. (05:42) So they've been having a jolly time for (05:44) the last three weeks while we've been (05:47) going about our business, and we're going (05:48) to pick up with them tonight.

What I want (05:51) to be really looking at is: I want to be (05:52) thinking about the way in which there (05:56) are a lot of juxtapositions, I think, in (05:57) Chapter 4. (05:59) ‘Juxtaposition’ is a really fun word, which (06:02) I enjoy saying whenever I can. That is, (06:04) when things are kind (06:06) of put next to each other in a way that (06:09) makes us really think about both of them (06:10) differently.
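For anyone who would rather not paste the lines together by hand, here is a rough sketch of the joining step in Python. It assumes the YouTube transcript is copied as plain text with a timestamp on one line and the caption text on the next, as in the sample above. The function name `join_captions` is my own invention, not part of any YouTube tool, and the punctuation, capitalization, and corrections still have to be done by a person.

```python
import re

# A line that holds only a timestamp such as 05:10 (or 1:05:10 for long videos).
TIMESTAMP = re.compile(r"^\d{1,2}:\d{2}(?::\d{2})?$")


def join_captions(raw: str) -> str:
    """Join alternating timestamp/text lines into one run of text.

    Each timestamp is kept inline in parentheses so that it can be
    thinned out later by whoever edits the draft.
    """
    parts = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from copying
        if TIMESTAMP.match(line):
            parts.append(f"({line})")
        else:
            parts.append(line)
    return " ".join(parts)


sample = """05:10
alright so that's tomorrow night and
05:13
tonight we're digging back into chapter
05:16
4 so like 8 years ago when we last had"""

print(join_captions(sample))
```

This only does the mechanical part. The judgment calls (what to delete, what to correct, where sentences and paragraphs break) are untouched.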
 
Thanks Jonah,

We could probably remove some of the timecodes from within each of the paragraphs. I think a timecode at the beginning of each short paragraph or change of topic, making sure there is one for every minute or two, would be sufficient.

The YouTube transcripts would be a good place to start, and from there volunteers can proofread and condense the transcripts according to a style guide we can develop. For those videos without transcripts, some speedy typists could volunteer to do a first draft, then these can be passed on to a new set of eyes (or ears) to proofread as with the YouTube transcripts.

I have some very basic experience transcribing video interviews from my film course, but nothing professional. @NancyL has done work with Project Gutenberg, and has made some great points about process on the Thought on Organisation thread. Perhaps we could start working on creating a manual for transcribers and proofreaders to use?
 
Thanks Smaug.

I agree about the time stamps.

It seems like this might be a two-step process: (1) Produce draft transcripts from the YouTube transcripts or, for Episodes 1-15, directly from the audio; (2) Edit draft transcripts according to the style guide we are developing.

To begin step 1, we would not need to wait for the style guide to be developed but we would need some guidelines. Turning the YouTube transcript into sentences and paragraphs requires some judgment calls. Here are a few I encountered in the one-minute sample above:

1. What to delete? I deleted what seemed like 'filler speech': "you know what" at 5:57, and "you know" at 6:04. I also deleted the repeated "when when" at 6:04. Should we leave those things? Should we delete anything else?

2. What to correct? I only corrected what seemed like obvious transcribing errors: "hub" to "hobbits" at 5:20, and "barracks" to "marish" at 5:39.

3. Punctuation? How to punctuate when Corey makes a side comment? For example, in my text should the sentence starting at 5:59, "'Juxtaposition' is a really fun word, which I enjoy saying whenever I can." be in parentheses? Also, how should we punctuate when Corey interrupts himself? For example, between 5:20 and 5:25 I used a dash: "we left the hobbits – at least you have to".

4. Time stamps? I left all the YouTube time stamps in. But I agree with you that it would be better to leave only the time stamps at the beginning of paragraphs and every minute in the case of long paragraphs.

My answers to these questions: I suggest we make only the obvious kinds of deletions and corrections that I made in the above sample. I don't think we should worry too much about punctuation, since the style editors will be concerned with that in step 2. I think we should just try to make our texts legible. And I already gave my opinion about time stamps.
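If it helps, the time stamp rule could be applied mechanically once a paragraph has been punctuated. The following is only a sketch under my own assumptions: time stamps are written inline as (mm:ss), the first one in a paragraph is always kept, and a later one is kept only if at least a minute has passed since the last one kept. The name `thin_timestamps` and the 60-second default are placeholders we could change once the style guide settles the question.

```python
import re

# An inline time stamp such as "(05:10)"; tolerates a stray space before ")".
INLINE_TS = re.compile(r"\((\d{1,2}):(\d{2})\s*\)\s*")


def thin_timestamps(paragraph: str, min_gap_seconds: int = 60) -> str:
    """Keep the first time stamp and then at most one per min_gap_seconds."""
    last_kept = None  # time of the most recently kept stamp, in seconds

    def keep_or_drop(match: re.Match) -> str:
        nonlocal last_kept
        seconds = int(match.group(1)) * 60 + int(match.group(2))
        if last_kept is None or seconds - last_kept >= min_gap_seconds:
            last_kept = seconds
            return f"({match.group(1)}:{match.group(2)}) "
        return ""  # drop this stamp, keep the surrounding text

    return INLINE_TS.sub(keep_or_drop, paragraph).strip()


draft = (
    "(05:10) Alright, so that's tomorrow night, and (05:13) tonight we're "
    "digging back into Chapter (05:16) 4. (06:12) Later."
)
print(thin_timestamps(draft))
```

Run on one paragraph at a time, this keeps the opening time stamp of each paragraph and roughly one per minute after that. It does not handle hour-long time codes such as (1:05:10), so those would need a small extension.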
 
Transcribing directly from the audio would be possible, though tedious.

There is transcription software. But I think we would need the original audio files for that. Maybe Corey or another Signum insider could give us access to those?
 
I used to transcribe lectures from my seminary so I have plenty of experience with it. There is software that will transcribe audio but it's usually filled with errors and is not free after a certain amount. I agree that it is a tedious process but I think necessary if we want accuracy.
 
I am willing to help! We could create an episode list in a Doc for volunteers to sign up to transcribe chosen episodes. I would hate for someone to transcribe an episode that another person is already working on.
 
I think creating a (continually growing) list of Episodes to be (1) transcribed and (2) edited, is an excellent idea. Would you be able to do that, Lashley? I will take Episode 16, since I already started it. I will probably be able to take more, but I'll hold off on further commitment until I see how long it takes to do one.
 
Thanks for making the Google Doc @Lashley66!

If we want to make a style guide for transcription and transcription editing, I think it might be good to let it happen organically. If each of us who are interested in transcription would like to work on one or two episodes to start with, we can then share our results with each other and compare methods. From there we can decide on some very basic guidelines for future transcription, and possibly upload an example finished transcript for those interested to look at.

I'll sign myself up for editing a YouTube transcript to work on in between keeping an eye on all of these forums.
 
Episode 16 Transcript Draft is completed.

I did not include the introductory announcements or the field trip at the end. The transcription will need some editing. I tried to transcribe the audio into legible, mostly grammatical paragraphs. But I didn't really try to be consistent about punctuation or other elements of style. I only guessed at the spelling of the user names of the people to whom Corey responds. Also, I didn't go through my finished text carefully, so I'm sure there are typos and other mistakes that a copy editor would find.

Some remarks about using the YouTube transcript:

We probably shouldn't think of the YouTube transcripts as things that can be edited. I'm a pretty slow typist. I can't come close to keeping up with the audio, even when I slow it down. Still, I found that typing the transcription was much more efficient than copying the YouTube text and editing it. The errors in the YouTube transcription, the lack of punctuation, and the small amount of text between timestamps made my attempts to edit the YouTube text much slower than typing.

I did, however, find the YouTube transcript very helpful to look at while I was typing. I found it helpful for inserting timestamps, remembering the audio content I was transcribing, and navigating to parts of the audio I wanted to hear again.

I will proceed to Episode 17. I think it better to leave Episodes 1-15 to better typists who can keep up with the audio better than I can. I need the crutch of the YouTube transcript.
 
I've finished a draft of Episode 20.
I actually found it quicker to use the YouTube transcript as a basis, deleting the surplus timestamps and making editorial changes, rather than re-typing all of the material, although I did have to re-type in some areas where the YouTube mistakes were particularly bad. I think it will just be personal preference whether people want to edit the YouTube ones or type it themselves.

A few things I thought of when I was transcribing:
1. I think we should use this opportunity of going through the episodes to timestamp Tony's summaries, as this may come in handy. Each time an episode transcript is finished, it would be great to just scan through, find Tony's main points in the discussion, and add a starting time code for each major point. I'm thinking just the main topics, not the dot points underneath.
2. Corey reads through the text in the transcript, and if we upload all of the text readings within the transcripts, this could be a copyright infringement (unlikely as it is that anyone would go through the transcripts to create an ebook). I'm not 100% sure on this, but before we actually publish the transcripts to the website (once it has been made) I think we should decide whether to include the text readings, and if not, decide how to demonstrate the text reading in the transcript. In my transcript, for each reading I wrote something like - Text reading: “Good heavens!” to “escape so easily!”. Alternatively, we could use a citation system (currently in progress) that we will implement anyway for other locations within the website.
3. I don't think the wiki page for each episode should be based on the transcript, as they are really quite long. I would opt for having each episode page based on Tony's summary, which contains most of the points discussed, and then have the transcript available for viewing and/or download for those that would like it. We could then link each of Tony's points to a page which explores each point in more detail, and contains further links to related discussions.

I'll continue doing some transcribing in between other tasks.
 
So, should I not be doing the episode summaries anymore?
I would be grateful, Tony, if you would continue to post your episode summaries, and I certainly haven't meant to suggest otherwise. I think them very well done and an essential part of this project going forward.
 
Okay, I just wanted to make sure that the work I had done previously wasn't being duplicated. My biggest fear all along has been that I had put three years of work into those and they would end up not being used.
 
Your episode summaries have been very useful to me in identifying the themes I have tried to track episode by episode. And, as I understand it, the episode transcripts we have started drafting are meant to complement and (eventually) link to your episode summaries.

I share your concern about your work being useful. I have been frustrated by how this project seems to have stalled. For my part, I will try to persevere.
 
@Tony Meade If you are willing, please definitely continue with the summaries. Thank you so much for all the work you have been doing on them. I don't think this project would be the same without them!

I understand your concern given we have been starting on episode transcripts, but your summaries will form the major basis for the wiki as I see it. They are an excellent overview of what was discussed in each episode, and they will be instrumental in creating a pathway for people to catch up with the series from now on, reducing the barrier to entry.

The transcripts are being created specifically for those who are interested in:
a) reading or searching the text of the sessions instead of listening;
b) delving into more detail than your summaries provide;
c) citing the exact wording of a discussion point, as Corey often words the points very well and we want to preserve that.

I know what you mean Jonah, thanks for continuing! I'm working behind the scenes on how to build up momentum again, stay tuned!
 