If it's just two sentences, the recording might be small enough to store in the database. You can record the sound in the browser with the JavaScript microphone API (getUserMedia together with MediaRecorder), base64-encode the data, and store it in a LongStringField. But if you need to record large amounts of audio, or record from many participants, it could quickly overload the database. In that case you would need to embed a widget that uploads the data to a third-party server, since you can't upload files to oTree.
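
To get a feel for the size trade-off, here is a rough Python sketch of the server side. It assumes the browser sent the recording as base64 text (either plain, or as a data URL like the one FileReader.readAsDataURL produces) and that you saved that text in a LongStringField. The names MAX_BASE64_CHARS, encoded_length, fits_in_field and decode_recording are made up for illustration and are not part of oTree, and the size cap is an arbitrary placeholder.

```python
import base64

# Hypothetical cap on the stored string; tune it to your database and sample size.
MAX_BASE64_CHARS = 2_000_000


def encoded_length(num_bytes: int) -> int:
    """Length of the padded base64 string for num_bytes of raw audio."""
    return 4 * ((num_bytes + 2) // 3)


def fits_in_field(num_bytes: int, limit: int = MAX_BASE64_CHARS) -> bool:
    """True if a recording of num_bytes stays under the chosen cap once encoded."""
    return encoded_length(num_bytes) <= limit


def decode_recording(stored: str) -> bytes:
    """Turn the stored string back into raw audio bytes.

    Accepts plain base64 or a data URL such as
    'data:audio/webm;codecs=opus;base64,...'.
    """
    if stored.startswith("data:"):
        # Drop the 'data:<mime>;base64,' header and keep only the payload.
        _, _, stored = stored.partition(";base64,")
    return base64.b64decode(stored)


if __name__ == "__main__":
    # Stand-in for real audio bytes, just to exercise the helpers.
    fake_audio = bytes(range(256)) * 10
    stored_value = base64.b64encode(fake_audio).decode("ascii")
    print(len(fake_audio), "bytes ->", len(stored_value), "base64 characters")
    with open("recording.webm", "wb") as f:
        f.write(decode_recording(stored_value))
```

Base64 makes the data about a third larger than the raw audio, so multiply your expected recording size by 4/3 and by the number of participants before deciding whether the database route is realistic.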