At Twitter, #HackWeek is a time when the company comes together to dream, design, and build new ways to make Twitter better. #HackWeek gives our team the time and space to imagine what might be, to find solutions for problems, and to practice innovation through experimentation.
Audio-only broadcasting in Periscope is something the community has been asking for, and some broadcasters have already been doing it by covering the camera lens. Sometimes people are not comfortable being on camera, but they still want to broadcast and interact with others via Periscope’s powerful chatroom feature. We were asked to build this feature for #HackWeek. Initially we thought the project might be too big in scope for #HackWeek, but after some research and a focus on the core of the problem, we realized we could get it done in three days.
The most obvious solution is to quite literally stream just audio. However, that path would involve multiple teams (all the front-end clients and the backend infrastructure) and could take a few months to complete. Since #HackWeek is actually four days of development and one day to demo, time is very tight, so it is essential to focus on the aspects that matter most.
The goal for audio-only broadcasting is to give people a way to broadcast without having to show themselves or their surroundings. The goal is not engineering support for another media format. When framed that way, I realized that all I needed to do was render a video of what the audio looks like, and that video would take the place of the camera pixels. This approach has many advantages: it’s backwards compatible, meaning it plays everywhere; it’s easier to integrate into the existing broadcasting pipeline; and it’s something we could build with minimal dependencies.
The final solution is elegantly simple: it’s basically a minor change to the data flow within the Periscope app. The iPhone microphone produces the audio stream, and we continue to use that as-is. The camera produces a video stream, which we immediately discard; instead, we create dynamic video animations informed by the audio data and render them on the iPhone in real time. The audio stream is sent to the audio visualizer renderer, which takes the raw audio data and generates the waveform and volume level indicator. This new video stream is sent to the backend instead of the one from the camera.
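To make that data flow concrete, here is a minimal sketch of the audio-to-frame step, written in Python for readability rather than in the app's own code. It is an illustration under stated assumptions, not the Periscope implementation: the function names (`volume_level`, `waveform_peaks`, `render_frame`), the choice of RMS for the volume level, and the tiny 0/1 pixel grid standing in for a real video frame are all hypothetical.

```python
import math
from typing import List, Sequence


def volume_level(samples: Sequence[float]) -> float:
    """Root-mean-square level of a chunk of samples in [-1.0, 1.0].

    RMS is an assumed choice of loudness measure for this sketch.
    """
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def waveform_peaks(samples: Sequence[float], columns: int) -> List[float]:
    """Reduce a chunk of samples to one peak amplitude per output column."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    if not samples:
        return [0.0] * columns
    n = len(samples)
    peaks = []
    for c in range(columns):
        start = c * n // columns
        # Guarantee at least one sample per column, even when n < columns.
        end = max(start + 1, (c + 1) * n // columns)
        peaks.append(max(abs(s) for s in samples[start:end]))
    return peaks


def render_frame(samples: Sequence[float], width: int, height: int) -> List[List[int]]:
    """Draw one frame as a grid of pixels (1 = lit, 0 = background).

    Each column is a vertical bar centered on the middle row, with a
    height proportional to that column's peak amplitude. Silence still
    draws a flat center line.
    """
    peaks = waveform_peaks(samples, width)
    frame = [[0] * width for _ in range(height)]
    mid = height // 2
    for x, peak in enumerate(peaks):
        half = int(round(min(peak, 1.0) * mid))
        for y in range(mid - half, mid + half + 1):
            if 0 <= y < height:
                frame[y][x] = 1
    return frame


if __name__ == "__main__":
    # A hypothetical chunk of microphone samples: loud, then quiet.
    chunk = [0.9, -0.8, 0.7, -0.6, 0.1, -0.1, 0.05, 0.0]
    print("level:", round(volume_level(chunk), 3))
    for row in render_frame(chunk, width=4, height=7):
        print("".join("#" if px else "." for px in row))
```

In a pipeline like the one described above, a frame produced this way would be handed to the video encoder in place of the camera frame, while the microphone audio continues down its usual path unchanged.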
In summary, it’s important to remember that in software development there’s never just one correct answer to a problem. It’s easy to get lost in the details, over-engineer a solution, and miss the point of the feature you are working on. By focusing on the end result, we were able to find a golden path to solve a problem in a short amount of time, and give people a new feature to enjoy on Periscope that they have been requesting. The ultimate goal is delighting people who use our service and serving the public conversation.
— Richard Plom, Staff Engineer