This topic describes how to integrate the ApsaraVideo Real-time Communication (ARTC) SDK into a Linux Go project. You will build a simple real-time audio and video application. This solution supports server-side scenarios such as video conferencing, interactive streaming, and cloud recording.
Feature description
Before you begin, understand the following key concepts:
ARTC SDK: An SDK provided by Alibaba Cloud that helps developers quickly implement real-time audio and video interaction.
Global Realtime Transport Network (GRTN): A globally distributed network engineered for real-time media that provides ultra-low-latency, high-quality, and secure communication.
Channel: A virtual room that users join to communicate with each other. All users in the same channel can interact in real time.
Host: A user who can publish audio and video streams in a channel and subscribe to streams published by other hosts.
Viewer: A user who can subscribe to audio and video streams in a channel but cannot publish their own.
Basic process for implementing real-time audio and video interaction:
Call setChannelProfile to set the scenario, and call joinChannel to join a channel:
Video call scenario: All users are hosts and can both publish and subscribe to streams.
Interactive streaming scenario: Roles must be set using setClientRole before joining a channel. For users who will publish streams, set the role to host. If a user only needs to subscribe to streams, set the role to viewer.
After joining the channel, users have different publishing and subscribing behaviors based on their roles:
All users can receive audio and video streams within that channel.
A host can publish audio and video streams in the channel.
If a viewer wants to publish streams, call the setClientRole method to switch the role to host.
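The role rules above can be modeled in a few lines of Go. This is an illustrative sketch only: the Role type, its constants, and the CanPublish and CanSubscribe helpers are hypothetical names and are not part of the ARTC SDK.

```go
package main

import "fmt"

// Role is a hypothetical stand-in for the SDK's client role.
type Role int

const (
	RoleViewer Role = iota // may only subscribe
	RoleHost               // may publish and subscribe
)

// CanPublish reports whether a user with this role may publish streams.
func (r Role) CanPublish() bool { return r == RoleHost }

// CanSubscribe reports whether a user with this role may receive streams.
// Both roles can subscribe.
func (r Role) CanSubscribe() bool { return r == RoleHost || r == RoleViewer }

func main() {
	role := RoleViewer
	fmt.Println(role.CanPublish(), role.CanSubscribe())

	// Switching the role to host is what allows a former viewer to publish.
	role = RoleHost
	fmt.Println(role.CanPublish(), role.CanSubscribe())
}
```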
Sample project
The SDK package includes a sample program:
Sample | File | Description
Feature demo | | Demonstrates full stream publishing and subscription, token generation, and more
To run the sample:
cd Go
go run demo.go
Prerequisites
Operating system: Linux (kernel 2.6 or later)
Go version: Go 1.16 or later
Network: A stable Internet connection
Application setup: Obtain the AppID and AppKey for your ApsaraVideo Real-time Communication application. For details, see Create Application.
Implementation steps
Step 1: Import the SDK
The SDK package directory structure is as follows:
AliRTCSDK_Linux/
└── Go/
    ├── alirtc/
    │   ├── AliRTCEngine.go              # Go interface wrapper
    │   ├── AliRTCEngineImpl.go          # Interface implementation
    │   ├── AliRTCLinuxSdkDefine.go      # Data structures and enumerations
    │   └── lib/
    │       ├── AliRtcCoreService        # Background service process (absolute path required)
    │       └── libAliRtcLinuxEngine.so  # Full dynamic library of the SDK
    ├── demo.go                          # Feature demo sample
    └── go.mod
In your Go file, import the SDK:
import "alirtc"
Configure Go Modules (if module errors occur):
cd alirtc && go mod init alirtc && cd ..
go mod init your_project
go mod edit -replace alirtc=./alirtc
go mod tidy
Before running, set the dynamic library search path:
export LD_LIBRARY_PATH=/path/to/Go/alirtc/lib:$LD_LIBRARY_PATH
Step 2: Implement the event handler struct
Implement the alirtc.EngineEventHandlerInterface interface to receive notifications from the SDK.
package main

import (
	"fmt"

	"alirtc"
)

type VideoCallEventHandler struct{}

func (e *VideoCallEventHandler) OnJoinChannelResult(result int, channel, userId string) {
	if result == 0 {
		fmt.Printf("[OnJoinChannelResult] User %s joined channel %s successfully\n", userId, channel)
	} else {
		fmt.Printf("[OnJoinChannelResult] Failed to join, error: %d\n", result)
	}
}

func (e *VideoCallEventHandler) OnRemoteUserOnLineNotify(uid string) {
	fmt.Printf("[OnRemoteUserOnLineNotify] uid: %s\n", uid)
}

func (e *VideoCallEventHandler) OnRemoteUserOffLineNotify(uid string) {
	fmt.Printf("[OnRemoteUserOffLineNotify] uid: %s\n", uid)
}

func (e *VideoCallEventHandler) OnRemoteTrackAvailableNotify(uid string,
	audioTrack alirtc.AudioTrack, videoTrack alirtc.VideoTrack) {
	fmt.Printf("[OnRemoteTrackAvailableNotify] uid: %s, audio: %d, video: %d\n",
		uid, audioTrack, videoTrack)
}

// OnSubscribeMixAudioFrame: Receive mixed PCM audio frames from remote users
// Triggered when SubscribeAudioFormat = AudioFormatMixedPcm in the subscription configuration
func (e *VideoCallEventHandler) OnSubscribeMixAudioFrame(frame alirtc.AudioFrame) {
	// frame.Pcm.Data        PCM data ([]byte, int16_t format)
	// frame.Pcm.Channels    Number of sound channels
	// frame.Pcm.SampleRates Sample rate
	// Write to file, send to audio device, or decode and play here
}

// OnSubscribeAudioFrame: Receive unmixed PCM frames from individual remote users, identified by uid
// Triggered when SubscribeAudioFormat = AudioFormatPcmBeforMixing in the subscription configuration
func (e *VideoCallEventHandler) OnSubscribeAudioFrame(uid string, frame alirtc.AudioFrame) {
	// uid identifies which remote user sent this frame
	// Process audio data per user here
}

// OnRemoteVideoSample: Receive remote video frames
// uid identifies which remote user sent this video stream
func (e *VideoCallEventHandler) OnRemoteVideoSample(uid string, frame alirtc.VideoFrame) {
	// uid identifies which remote user sent this frame
	// frame.Frame.Width / frame.Frame.Height resolution
	// Write to file, send to renderer, or send to video decoder here
}

func (e *VideoCallEventHandler) OnError(errorCode alirtc.ErrorCode) {
	fmt.Printf("[OnError] error_code: 0x%X\n", errorCode)
}
Step 3: Authenticate with a token
The Go SDK includes the GenerateToken method. You can generate a single-parameter join token directly on the client.
Token generation flow:
Concatenate the string:
appId + appKey + channelId + userId + nonce + timestamp
Compute the SHA-256 hash and obtain a hexadecimal string.
Build JSON:
{"appid":..., "channelid":..., "userid":..., "nonce":..., "timestamp":..., "token":<sha256>}
Base64-encode the JSON string.
import "time"

authInfo := alirtc.AuthInfo{
	AppID:    "your_app_id",
	Channel:  "your_channel_id",
	UserID:   "your_user_id",
	UserName: "your_user_id",
}
expire := time.Now().Add(24 * time.Hour)
authInfo.Timestamp = int64(expire.Unix()) // Expires in 24 hours
appKey := "your_app_key" // Never expose AppKey in client code in production
// Call GenerateToken to generate a single-parameter join token (engine instance must be created first)
authInfo.Token = linuxEngine.GenerateToken(authInfo, appKey)
Security note: Generating tokens on the client works only for development and testing. In production, your business server must generate and deliver tokens. Do not expose your AppKey in client code.
Step 4: Create and initialize the audio and video engine
Call alirtc.CreateAliRTCEngine to create an engine instance. Pass in your event handler object.
import (
	"encoding/json"
	"fmt"
	"os"

	"alirtc"
)

eventHandler := &VideoCallEventHandler{}
coreServicePath := "/path/to/Go/alirtc/lib/AliRtcCoreService" // Absolute path to AliRtcCoreService
h5mode := false // Set to true only for interoperability with Web (H5)
extraJobj := map[string]interface{}{
	"user_specified_disable_audio_ranking": "true",
}
extraBytes, _ := json.Marshal(extraJobj)
extra := string(extraBytes)
linuxEngine := alirtc.CreateAliRTCEngine(
	eventHandler,
	42000, 45000, // IPC port range for communication with AliRtcCoreService process
	"/tmp", // Log file directory
	coreServicePath,
	h5mode,
	extra,
)
if linuxEngine == nil {
	fmt.Fprintln(os.Stderr, "Failed to create RTC engine")
	os.Exit(1)
}
Each engine instance starts one AliRtcCoreService background process (one virtual user).
Step 5: Configure audio and video properties
Call SetClientRole to set the user role. Call SetVideoEncoderConfiguration to configure video encoding parameters.
// Call SetClientRole to set the role to interactive (host), allowing both publishing and subscribing
linuxEngine.SetClientRole(alirtc.AliEngineClientRoleInteractive)
// Call SetVideoEncoderConfiguration to set video encoding parameters
videoConfigPtr := alirtc.NewAliEngineVideoEncoderConfiguration(
	720, 1280,
	alirtc.AliEngineFrameRateFps15,
	1200, 0,
	alirtc.AliEngineVideoEncoderOrientationModeAdaptive,
	alirtc.AliEngineVideoMirrorModeDisabled,
	alirtc.AliEngineRotationMode_0,
)
videoConfig := *videoConfigPtr
linuxEngine.SetVideoEncoderConfiguration(videoConfig)
Step 6: Configure stream publishing and subscription
Configure how audio and video streams are published and subscribed. Enable the Linux-specific external audio and video sources mode.
// Call PublishLocalVideoStream / PublishLocalAudioStream to enable local stream publishing
linuxEngine.PublishLocalVideoStream(true)
linuxEngine.PublishLocalAudioStream(true)
// Linux has no built-in camera or microphone. Call SetExternalVideoSource to enable external video input.
// Push YUV frame data using PushExternalVideoFrame.
linuxEngine.SetExternalVideoSource(true, alirtc.VideoSourceCamera, alirtc.RenderModeFill)
// Call SetExternalAudioSource to enable external audio input. Push PCM frame data using PushExternalAudioFrameRawData.
linuxEngine.SetExternalAudioSource(true, 16000 /* sample rate */, 1 /* number of sound channels */)
Configure subscription behavior through JoinChannelConfig. The configuration takes effect when it is passed to JoinChannel in Step 7:
joinConfig := alirtc.NewJoinChannelConfig()
joinConfig.ChannelProfile = alirtc.ChannelProfileInteractiveLive
joinConfig.PublishMode = alirtc.PublishAutomatically // Publish automatically
joinConfig.SubscribeMode = alirtc.SubscribeAutomatically // Subscribe automatically
joinConfig.PublishAvsyncMode = alirtc.PublishAvsyncWithPts
// Two options for audio subscription format:
// AudioFormatMixedPcm: Receive mixed PCM for the entire channel. Triggers OnSubscribeMixAudioFrame.
// AudioFormatPcmBeforMixing: Receive unmixed PCM per remote user. Triggers OnSubscribeAudioFrame (with uid).
joinConfig.SubscribeAudioFormat = alirtc.AudioFormatMixedPcm
// Receive H264 video frames. Triggers OnRemoteVideoSample.
joinConfig.SubscribeVideoFormat = alirtc.VideoFormatH264
Step 7: Join the channel
Call JoinChannel with the token from Step 3 and the join configuration from Step 6.
// Call JoinChannel to join the channel (single-parameter join)
linuxEngine.JoinChannel(authInfo.Token, authInfo.Channel, authInfo.UserID, authInfo.UserName, joinConfig)
Do not call JoinChannel more than once. Use GenerateToken only for development and testing. In production, obtain tokens from your business server to prevent AppKey leaks.
Step 8: Push external video frames
Linux has no built-in camera driver interface. Feed YUV video data into the SDK using external input. This example reads I420-format YUV frames in a loop from a file. In production, replace this with camera driver output or video decoder output.
import (
	"os"
	"time"

	"alirtc"
)

go func() {
	width, height, fps := 720, 1280, 15
	frameSize := width * height * 3 / 2 // I420
	videoFile, err := os.Open("/tmp/test_720p.yuv")
	if err != nil {
		panic(err)
	}
	defer videoFile.Close()
	vTs := 0
	for running { // running is set to false in Step 11 to stop this goroutine
		data := make([]byte, frameSize)
		n, err := videoFile.Read(data)
		if err != nil || n < frameSize {
			videoFile.Seek(0, 0) // Rewind to start when end of file reached
			continue
		}
		videoSample := alirtc.VideoDataSample{
			Width:      width,
			Height:     height,
			Format:     alirtc.VideoDataFormatI420,
			BufferType: alirtc.VideoBufferTypeRawData,
			Rotation:   0,
		}
		videoSample.StrideY = width
		videoSample.StrideU = width / 2
		videoSample.StrideV = width / 2
		videoSample.DataLen = frameSize
		videoSample.Data = data
		videoSample.TimeStamp = vTs
		linuxEngine.PushExternalVideoFrame(&videoSample, alirtc.VideoSourceCamera)
		vTs += 1000 / fps
		time.Sleep(time.Duration(1000/fps) * time.Millisecond)
	}
}()
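The frame size and stride values in this step follow from the I420 layout: a full-resolution Y plane followed by U and V planes at half the width and half the height. The sketch below works through that arithmetic for a tightly packed frame with even dimensions. It also shows a timestamp derived from the frame index, because incrementing by the truncated value 1000/fps (66 ms at 15 fps) falls 10 ms short of one second every 15 frames. The i420Layout type and both helper functions are hypothetical names used for illustration.

```go
package main

import "fmt"

// i420Layout describes where each plane of a tightly packed I420 frame
// starts and how many bytes it occupies.
type i420Layout struct {
	YSize, USize, VSize int
	UOffset, VOffset    int
	Total               int
}

// newI420Layout assumes even width and height and no row padding.
func newI420Layout(width, height int) i420Layout {
	y := width * height
	c := (width / 2) * (height / 2) // each chroma plane is a quarter of Y
	return i420Layout{
		YSize: y, USize: c, VSize: c,
		UOffset: y, VOffset: y + c,
		Total: y + 2*c,
	}
}

// frameTimestampMs derives the timestamp from the frame index so that
// integer truncation of 1000/fps does not accumulate.
func frameTimestampMs(index, fps int) int {
	return index * 1000 / fps
}

func main() {
	l := newI420Layout(720, 1280)
	fmt.Println("total bytes:", l.Total, "U offset:", l.UOffset, "V offset:", l.VOffset)
	fmt.Println("index-based:", frameTimestampMs(15, 15), "accumulated:", 15*(1000/15))
}
```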
Step 9: Push external audio frames
Linux has no built-in microphone recording interface. Feed PCM audio data into the SDK using external input. This example reads PCM frames (int16_t, 16 kHz, mono) in a loop from a file. In production, replace this with microphone driver output or audio decoder output.
go func() {
	sampleRate := 16000
	channels := 1
	frameMs := 20 // 20 ms per frame
	frameSize := (sampleRate / 1000) * frameMs * 2 * channels // int16_t = 2 bytes
	audioFile, err := os.Open("/tmp/test_16k_mono.pcm")
	if err != nil {
		panic(err)
	}
	defer audioFile.Close()
	aTs := 0
	for running { // running is set to false in Step 11 to stop this goroutine
		data := make([]byte, frameSize)
		n, err := audioFile.Read(data)
		if err != nil || n < frameSize {
			audioFile.Seek(0, 0) // Rewind to start when end of file reached
			continue
		}
		ret := linuxEngine.PushExternalAudioFrameRawData(data, frameSize, int64(aTs))
		if ret != 0 {
			// SDK buffer is full. Rewind file pointer and retry later.
			audioFile.Seek(int64(-n), 1)
			time.Sleep(20 * time.Millisecond)
			continue
		}
		aTs += frameMs
		time.Sleep(time.Duration(frameMs) * time.Millisecond)
	}
}()
Step 10: Handle remote audio and video playback
Linux has no built-in audio or video playback devices. The SDK delivers remote audio and video frames through callbacks, and your application handles them, for example by writing to files, sending to decoders, or connecting to playback devices.
Audio playback
Based on the SubscribeAudioFormat setting in Step 6, one of these callbacks triggers when remote audio frames arrive:
// AudioFormatMixedPcm mode: Receive mixed PCM for the entire channel
func (e *VideoCallEventHandler) OnSubscribeMixAudioFrame(frame alirtc.AudioFrame) {
	// frame.Pcm.Data        PCM data ([]byte, int16_t format)
	// frame.Pcm.Channels    Number of sound channels
	// frame.Pcm.SampleRates Sample rate
	// Write to file, send to audio device, or decode and play here
}

// AudioFormatPcmBeforMixing mode: Receive unmixed PCM per remote user
func (e *VideoCallEventHandler) OnSubscribeAudioFrame(uid string, frame alirtc.AudioFrame) {
	// uid identifies which remote user sent this frame
	// Process audio data per user here
}
Video playback
When remote video frames arrive, the OnRemoteVideoSample callback triggers. The frame format depends on the SubscribeVideoFormat setting in Step 6:
func (e *VideoCallEventHandler) OnRemoteVideoSample(uid string, frame alirtc.VideoFrame) {
	// uid identifies which remote user sent this frame
	// frame.Frame.Width / frame.Frame.Height resolution
	// Write to file, send to renderer, or send to video decoder here
}
Step 11: Leave the channel and destroy the engine
Release resources properly. Stop publishing, leave the channel, and then destroy the engine.
// Stop the external video and audio push goroutines
running = false
// Wait for the goroutines to exit (use sync.WaitGroup if needed)
// Call PublishLocalVideoStream(false) / PublishLocalAudioStream(false) to stop publishing
linuxEngine.PublishLocalVideoStream(false)
linuxEngine.PublishLocalAudioStream(false)
// Call LeaveChannel to leave the channel
linuxEngine.LeaveChannel()
// Wait for OnLeaveChannelResult callback (stopSignal set to true) before destroying the engine
for !stopSignal {
	time.Sleep(time.Second)
}
// Call Release to destroy the engine (must be called after LeaveChannel)
linuxEngine.Release()
linuxEngine = nil
FAQ
Q: How do I fix a module error when I run go run demo.go?
Delete the go.mod files in the current directory and the alirtc directory. Then run:
cd alirtc && go mod init alirtc && cd ..
go mod init your_project
go mod edit -replace alirtc=./alirtc
go mod tidy
go run demo.go
Q: When should I set h5mode to true?
Set h5mode to true only when you need interoperability with Web (H5) pages. For Linux-to-Linux interoperability, use false.
Q: How do I handle expired tokens?
Listen for the OnAuthInfoWillExpire callback (token about to expire). Regenerate the token and call the engine’s refresh method to update credentials. You do not need to rejoin the channel.
Listen for the OnAuthInfoExpired callback (token expired). Leave the channel and rejoin with a new token.
Q: Why does the runtime report “cannot open shared object file”?
error while loading shared libraries: libAliRtcLinuxEngine.so: cannot open shared object file
Solution: Run export LD_LIBRARY_PATH=/path/to/Go/alirtc/lib:$LD_LIBRARY_PATH.