
· 3 min read
Roy Firestein

As a salesperson, building trust with your prospects is crucial for closing deals and establishing long-term relationships. But how can you create that trust quickly and effectively? One approach is to take a cue from the medical profession and treat your sales conversations more like a doctor's visit.

When you go to the doctor, they don't just sit down and start prescribing medication or treatments. Instead, they ask probing questions to diagnose your condition and understand your unique needs. They want to get to the root of the problem before offering any solutions.

As a salesperson, you should adopt a similar mindset. Rather than launching straight into your pitch, focus on asking insightful questions that uncover your prospect's true challenges, goals, and motivations. This shows you care about understanding their situation, not just making a sale.

Doctors also aren't afraid to ask tough, even uncomfortable questions when needed. They know that getting the full picture is essential for making an accurate diagnosis and recommendation.

Likewise, don't shy away from probing deeper with prospects when you sense there's more to the story. Asking challenging questions demonstrates confidence and expertise. It positions you as a trusted advisor, not just another vendor. Prospects will respect your thoroughness.

Just like a good doctor, aim to prescribe solutions that are tailored to each prospect's specific "symptoms" or needs. Cookie-cutter approaches erode trust. Prospects want to feel you've carefully considered their unique situation.

Finally, remember that patients don't necessarily expect their doctor to be their best friend. While bedside manner is important, it's secondary to being a competent professional who asks the right questions and provides appropriate guidance.

The same is true in sales. Focus first on demonstrating credibility, insight and leadership through great questions and attentive listening. Build trust by showing prospects you're a safe pair of hands to guide them to the right solution.

Conclusion

To create trust like a doctor, salespeople should:

  1. Ask tough, challenging questions: Doctors ask patients uncomfortable questions to diagnose the problem accurately. Similarly, salespeople should ask prospects questions that they may not know the answer to, but feel the salesperson should. This demonstrates expertise and helps the salesperson understand the prospect's true needs and motivations.
  2. Focus on the process, not the outcome: Doctors follow a systematic process to diagnose and treat patients. Salespeople should also have a structured approach to selling, focusing on the steps that lead to a successful outcome rather than being overly attached to the final result.
  3. Provide leadership, a safe pair of hands, and insight: Prospects look for these three qualities in a salesperson. By demonstrating leadership, competence, and valuable insights, salespeople can establish trust and credibility, much like a doctor does with their patients.
  4. Avoid being needy or overly adaptive: Doctors don't change their approach to please patients; they do what is necessary to help them. Salespeople should similarly maintain their professionalism and not be overly concerned with being liked or adapting to every whim of the prospect.
  5. Be willing to challenge the prospect: Doctors often have to convince patients to make difficult lifestyle changes for their health. Salespeople should also be prepared to challenge their prospects' assumptions and guide them towards the best solution, even if it may be uncomfortable.

By adopting these doctor-like behaviors, salespeople can shift their focus from being liked to being trusted. This approach may not always be comfortable, but it is more likely to lead to successful long-term relationships with clients who view the salesperson as a credible expert and partner in achieving their goals.

· 3 min read
Roy Firestein

A new study by researchers at Cornell Tech and Intuit has revealed a concerning threat emerging from the rapid adoption of Generative AI (GenAI) capabilities into interconnected ecosystems of semi/fully autonomous agents. The paper introduces Morris II, the first worm designed to target these GenAI ecosystems through the use of adversarial self-replicating prompts.

· 14 min read
Roy Firestein

Are you a technical founder struggling to crack the code on outbound sales? Do you find yourself more comfortable debugging code than dialing for dollars? Fear not, intrepid entrepreneur, for I have the blueprint to take you from zero to sales hero. In this post, I'll share the hard-won lessons I learned while building a high-performing business development representative (BDR) team from scratch. Get ready to level up your sales game and watch your startup soar to new heights.

· 3 min read
Roy Firestein

As a Business Development Representative (BDR), your success hinges on your ability to connect with prospects and uncover their pain points. While your script and product knowledge are essential, there's one often-overlooked skill that can make or break your conversations: your tone.

· 9 min read
Roy Firestein

How hard is it to build an AI scammer or a frontdesk assistant? Not hard at all.

AI research is progressing at a breakneck pace thanks to large investments in the field over the last decade and increasing computational power. Demand for AI has exceeded initial expectations, with businesses and individuals alike relying on it to make daily tasks more efficient. New companies are emerging to capture business opportunities in the AI space. One such company is Groq, which is developing a new AI inference accelerator. Groq promises the fastest AI inference at the lowest price per 1M tokens.

Now, let's talk about chatting with AI in real-time. It's not as simple as it sounds. Imagine having a conversation where every reply comes after an awkward pause—it wouldn't be fun, or believable, right? For AI to keep up in a real chit-chat, it needs to snap back with answers in less than a blink. That's under 500 milliseconds, to be exact. Groq's hardware, combined with the right AI models, makes this kind of speedy banter possible.
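
To make the 500 millisecond figure concrete, the sketch below times individual pipeline stages and checks their sum against that budget. The helper names (`timed`, `within_budget`) and the stage labels are illustrative assumptions, not part of this project's script.

```python
import time

BUDGET_MS = 500  # response-time target mentioned above


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def within_budget(stage_times_ms, budget_ms=BUDGET_MS):
    """Return (fits, total) for a dict of per-stage timings in milliseconds."""
    total = sum(stage_times_ms.values())
    return total <= budget_ms, total
```

In practice you could wrap the transcription, LLM, and text-to-speech calls with `timed` and pass the collected timings to `within_budget` to see which stage dominates the delay.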

Think about all the cool stuff we can do with real-time voice chat AI. It's not just about asking your phone for the weather; it's about revolutionizing customer service, creating new ways to interact with technology hands-free, and offering a helping hand to those in need. But let's not sugarcoat it—there's a flip side. Just as we can use AI for good, some will try to use it for scams and other shady stuff. In this post, we're taking a deep dive into the world of real-time AI voice chats, showing you the good, the bad, and how to get your hands dirty building it.

You can skip to the Demo Videos section to see it all working together.

Requirements

This project will require the following:

  • An M2 MacBook with at least 16 GB of RAM
  • Free API key from Groq
  • Free API key from Eleven Labs
  • Python 3.10 and other software dependencies

Lab Setup

We begin with installing the required software.

Install Homebrew on your Mac if you haven't already.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Using brew, install python@3.10, portaudio, and ffmpeg.

brew install python@3.10 portaudio ffmpeg
  • portaudio is required for the pyaudio package.
  • ffmpeg is required for the pydub package.

Now we can create the virtual environment and install the required Python packages. Virtual environments are used to isolate the dependencies of a project from the system's Python installation.

python3.10 -m venv venv

Activate the virtual environment.

source venv/bin/activate

Great. Now we can install whisper.cpp, a C/C++ port of OpenAI's Whisper speech-to-text model that runs locally. Running the model locally avoids a network round trip for transcription, which improves the AI's response time. This port is also optimized to run with Apple's CoreML framework, so it can take advantage of the M2's neural engine.

Install the latest version from source:

pip install git+https://github.com/aarnphm/whispercpp.git -vv

Create a new file called chat.py and add the following code to import the required packages.

import os
import wave
from pydub import AudioSegment
from groq import Groq
from whispercpp import Whisper
from elevenlabs import generate, stream
import pyaudio

Define the required API keys:

# Set the API keys
os.environ["ELEVEN_API_KEY"] = "YOUR API KEY"
os.environ["GROQ_API_KEY"] = "YOUR API KEY"
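
Hard-coding keys in the script is convenient for a lab, but it is easy to commit them by accident. As a hedged alternative, the sketch below reads a key that was exported in the shell and fails early with a clear message if it is missing. The helper name `require_env` is hypothetical and not part of any of the libraries used here.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset."""
    value = os.environ.get(name, "").strip()
    # Treat the placeholder from the snippet above as "not set"
    if not value or value == "YOUR API KEY":
        raise RuntimeError(f"Set the {name} environment variable before running chat.py")
    return value
```

With this in place, `api_key=require_env("GROQ_API_KEY")` could replace the `os.environ.get(...)` call when creating the client.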

Download and initialize the Whisper model:

# Initialize the Whisper client
whisper = Whisper('tiny')

We are using Groq instead of OpenAI because it is faster. Create the Groq client:

# Create API clients
groq_client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

Define the system prompt:

# Set the system prompt
SYSTEM_PROMPT = "\n".join([
    "You are a friendly hotel frontdesk agent. You are here to help guests with their problems.",
    "Your responses must be very short. All of your responses must be conversational as if speaking to someone.",
    "Check-in is available after 3 PM, and check out is at 11 the next day."
])

Create the output folder for audio files:

# Output directory
output_dir = 'output'
os.makedirs(output_dir, exist_ok=True)

Create a helper function to play the AI speech:

def play_speech(prompt):
    audio_stream = generate(
        text=prompt,
        model="eleven_multilingual_v2",
        voice="Rachel",
        stream=True,
    )
    stream(audio_stream)

Create a function to generate LLM responses using Groq:

def llm_chat(user_input, chat_history, bot_name):

    # Build the message list: system prompt, prior turns, then the new user input
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        *chat_history,
        {"role": "user", "content": user_input}
    ]

    # Create the chat completion
    chat_completion = groq_client.chat.completions.create(
        messages=messages,
        model="mixtral-8x7b-32768"
    )

    # Extract the LLM response
    response = chat_completion.choices[0].message.content
    print(f"{bot_name}: {response}")

    return response
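
One thing to watch with this message layout: if the caller has already appended the latest user turn to chat_history before calling llm_chat, the request contains that turn twice. The sketch below shows one possible guard. The helper name `build_messages` is hypothetical, and the check assumes each history entry is a dict with "role" and "content" keys, as in the function above.

```python
def build_messages(system_prompt, chat_history, user_input):
    """Assemble the message list without repeating the latest user turn."""
    messages = [{"role": "system", "content": system_prompt}, *chat_history]
    last = chat_history[-1] if chat_history else None
    already_present = (
        last is not None
        and last.get("role") == "user"
        and last.get("content") == user_input
    )
    # Only append the new user turn if the history does not already end with it
    if not already_present:
        messages.append({"role": "user", "content": user_input})
    return messages
```

The guard only compares against the final history entry, so a guest who repeats an earlier question later in the conversation is still passed through.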

Create a function to transcribe the user's speech using Whisper:

def transcribe_audio(audio_file):

    # Transcribe the audio
    result = whisper.transcribe(audio_file)

    # Extract the transcription
    texts = whisper.extract_text(result)

    # Drop empty segments, lowercase the rest, and join into a single string
    return " ".join([text.lower() for text in texts if text.strip()])
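
Speech-to-text output often carries stray whitespace or trailing punctuation, so an exact string comparison against a spoken command such as "exit" may not match. The sketch below is one way to normalize the transcript first. The helper name `is_exit_command` is hypothetical, and whether this Whisper build adds punctuation is an assumption worth checking against your own recordings.

```python
import string


def is_exit_command(transcript, keyword="exit"):
    """Return True if the transcript is just the keyword, ignoring case,
    surrounding whitespace, and surrounding punctuation."""
    cleaned = transcript.lower().strip(string.punctuation + string.whitespace)
    return cleaned == keyword
```

Only text that consists of the keyword alone counts as a command, so a sentence that merely contains the word does not end the conversation.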

Create a function to record the user's speech:

def record_audio(file_path):

    # Initialize the PyAudio object
    p = pyaudio.PyAudio()

    # Set the audio parameters
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 44100
    CHUNK = 512
    RECORD_SECONDS = 5

    # Create the input stream (named mic_stream so it does not shadow
    # the stream() function imported from elevenlabs)
    mic_stream = p.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=RATE,
        input=True,
        frames_per_buffer=CHUNK
    )

    # Empty list to store the audio frames
    frames = []

    print("Recording...")

    # Record the audio
    try:
        for _ in range(int(RATE / CHUNK * RECORD_SECONDS)):
            data = mic_stream.read(CHUNK)
            frames.append(data)
    except KeyboardInterrupt:
        pass
    except Exception as e:
        print(f"Error while recording: {e}")
        raise

    print("Recording complete.")

    # Look up the sample width, then close the stream and release PyAudio
    sample_width = p.get_sample_size(FORMAT)
    mic_stream.stop_stream()
    mic_stream.close()
    p.terminate()

    # Write the recorded frames to a WAV file
    with wave.open(file_path, 'wb') as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(sample_width)
        wf.setframerate(RATE)
        wf.writeframes(b''.join(frames))
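
To see what these parameters produce without a microphone, the sketch below repeats the chunk arithmetic and writes silence to an in-memory WAV file using only the standard library. The silent frames are a stand-in for real input, and the 2-byte sample width is the size of a 16-bit sample such as paInt16.

```python
import io
import wave

RATE = 44100          # samples per second
CHUNK = 512           # samples read per call
RECORD_SECONDS = 5
SAMPLE_WIDTH = 2      # bytes per 16-bit sample

# Same loop bound as the recording function: whole chunks only
num_chunks = int(RATE / CHUNK * RECORD_SECONDS)
num_frames = num_chunks * CHUNK

# Stand-in for microphone data: one block of silence per chunk
frames = [b"\x00" * (CHUNK * SAMPLE_WIDTH) for _ in range(num_chunks)]

buffer = io.BytesIO()
with wave.open(buffer, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(SAMPLE_WIDTH)
    wf.setframerate(RATE)
    wf.writeframes(b"".join(frames))

# Read the file back to confirm its length
buffer.seek(0)
with wave.open(buffer, "rb") as wf:
    duration = wf.getnframes() / wf.getframerate()

print(num_chunks, num_frames, round(duration, 3))
```

Because the loop bound is truncated to whole chunks, the clip comes out slightly shorter than RECORD_SECONDS, which is harmless for this use.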

Create the main function to run the chat:

def converse():
    audio_file = "recording.wav"
    chat_history = []

    play_speech("Hello, welcome to SkyLounge Hotel. How can I help you today?")

    while True:

        # Record the user's audio
        record_audio(audio_file)

        # Transcribe the user's audio
        user_speech = transcribe_audio(audio_file)

        # Delete the temp audio file
        os.remove(audio_file)

        # Exit the chat if the user says "exit"
        if user_speech.lower() == "exit":
            break

        print(f"You: {user_speech}")

        # Send the user's speech to the LLM. llm_chat adds user_speech to the
        # messages itself, so it goes into the history only after the call;
        # appending it first would send the same turn twice.
        bot_response = llm_chat(user_speech, chat_history, "Bot")

        # Add the user's speech and the LLM response to the chat history
        chat_history.append({"role": "user", "content": user_speech})
        chat_history.append({"role": "assistant", "content": bot_response})

        # Play the LLM response using text-to-speech
        play_speech(bot_response)

        # Remove old chats from the chat history
        if len(chat_history) > 20:
            chat_history = chat_history[-20:]


if __name__ == "__main__":
    converse()
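
The history cap at the end of the loop keeps the prompt from growing without bound. A small sketch of the same idea as a standalone function follows. The name `trim_history` and the `max_messages` parameter are hypothetical, with the default of 20 taken from the loop above.

```python
def trim_history(chat_history, max_messages=20):
    """Return the most recent max_messages entries of the chat history."""
    if max_messages <= 0:
        # A slice of [-0:] would return the whole list, so handle this case explicitly
        return []
    return chat_history[-max_messages:]
```

Since turns are stored as user and assistant pairs, keeping `max_messages` even means the trimmed history still starts with a user message.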

And that's it! You can now run the chat.py file to start the real-time voice chat with the AI.

python chat.py
You can find the full script here.
import os
import wave
from pydub import AudioSegment
from groq import Groq
from whispercpp import Whisper
from elevenlabs import generate, stream
import pyaudio


# Initialize the Whisper client
whisper = Whisper('tiny')


# Set the API keys
os.environ["ELEVEN_API_KEY"] = "YOUR API KEY"
os.environ["GROQ_API_KEY"] = "YOUR API KEY"


# Create API clients
groq_client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)


# Set the system prompt
SYSTEM_PROMPT = "\n".join([
    "You are a friendly hotel frontdesk agent. You are here to help guests with their problems.",
    "Your responses must be very short. All of your responses must be conversational as if speaking to someone.",
    "Check-in is available after 3 PM, and check out is at 11 the next day."
])


# Output directory
output_dir = 'output'
os.makedirs(output_dir, exist_ok=True)


def play_speech(prompt):
    audio_stream = generate(
        text=prompt,
        model="eleven_multilingual_v2",
        voice="Rachel",
        stream=True,
    )
    stream(audio_stream)


def llm_chat(user_input, chat_history, bot_name):

    # Build the message list: system prompt, prior turns, then the new user input
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        *chat_history,
        {"role": "user", "content": user_input}
    ]

    # Create the chat completion
    chat_completion = groq_client.chat.completions.create(
        messages=messages,
        model="mixtral-8x7b-32768"
    )

    # Extract the LLM response
    response = chat_completion.choices[0].message.content
    print(f"{bot_name}: {response}")

    return response


def transcribe_audio(audio_file):

    # Transcribe the audio
    result = whisper.transcribe(audio_file)

    # Extract the transcription
    texts = whisper.extract_text(result)

    return " ".join([text.lower() for text in texts if text.strip()])


def record_audio(file_path):
    p = pyaudio.PyAudio()

    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 44100
    CHUNK = 512
    RECORD_SECONDS = 5

    # Named mic_stream so it does not shadow the elevenlabs stream() import
    mic_stream = p.open(
        format=FORMAT,
        channels=CHANNELS,
        rate=RATE,
        input=True,
        frames_per_buffer=CHUNK
    )
    frames = []

    print("Recording...")

    try:
        for _ in range(int(RATE / CHUNK * RECORD_SECONDS)):
            data = mic_stream.read(CHUNK)
            frames.append(data)
    except KeyboardInterrupt:
        pass
    except Exception as e:
        print(f"Error while recording: {e}")
        raise

    print("Recording complete.")

    # Look up the sample width, then close the stream and release PyAudio
    sample_width = p.get_sample_size(FORMAT)
    mic_stream.stop_stream()
    mic_stream.close()
    p.terminate()

    # Write the recorded frames to a WAV file
    with wave.open(file_path, 'wb') as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(sample_width)
        wf.setframerate(RATE)
        wf.writeframes(b''.join(frames))

def converse():
    audio_file = "recording.wav"
    chat_history = []

    play_speech("Hello, welcome to SkyLounge Hotel. How can I help you today?")

    while True:

        # Record the user's audio
        record_audio(audio_file)

        # Transcribe the user's audio
        user_speech = transcribe_audio(audio_file)

        # Delete the temp audio file
        os.remove(audio_file)

        if user_speech.lower() == "exit":
            break

        print(f"You: {user_speech}")

        # Send the user's speech to the LLM. llm_chat adds user_speech to the
        # messages itself, so it goes into the history only after the call.
        bot_response = llm_chat(user_speech, chat_history, "Bot")

        # Add the user's speech and the LLM response to the chat history
        chat_history.append({"role": "user", "content": user_speech})
        chat_history.append({"role": "assistant", "content": bot_response})

        # Play the LLM response using text-to-speech
        play_speech(bot_response)

        # Remove old chats from the chat history
        if len(chat_history) > 20:
            chat_history = chat_history[-20:]


if __name__ == "__main__":
    converse()

Demo Videos

Hotel frontdesk demo:

LLM chat demo as hotel staff

Bank scam demo:

LLM chat bank scam demo

Conclusions

Building a real-time voice chat with AI is trivial with the latest hardware and software. Bad actors are already experimenting with AI and using it in their campaigns. But this technology is not going away; in fact, it will only become more prevalent. I'm betting on Apple to take the lead soon with a personal assistant AI that can chat in real time and live on your device. AI models are becoming more efficient, and mobile devices are being redesigned to include more neural engines to power these models.

Also, check out these related projects if you want to run text-to-speech (TTS) and speech-to-text (STT) locally: