v0.1.0 SDK update #14

Draft · wants to merge 7 commits into base: main
81 changes: 59 additions & 22 deletions README.md
@@ -9,49 +9,86 @@ llama-stack-client-swift brings the inference and agents APIs of [Llama Stack](h
- **Inference & Agents:** Leverage remote Llama Stack distributions for inference, code execution, and safety.
- **Custom Tool Calling:** Provide Swift tools that Llama agents can understand and use.

## Quick Demo
See [here](https://github.com/meta-llama/llama-stack-apps/tree/ios_demo/examples/ios_quick_demo/iOSQuickDemo) for a complete iOS demo ([video](https://drive.google.com/file/d/1HnME3VmsYlyeFgsIOMlxZy5c8S2xP4r4/view?usp=sharing)) that uses a remote Llama Stack server for inference.

## Installation

1. Click "Xcode > File > Add Package Dependencies...".

2. Add this repo URL at the top right: `https://github.com/meta-llama/llama-stack-client-swift`.

3. Select and add `llama-stack-client-swift` to your app target.

4. On the first build: Enable & Trust the OpenAPIGenerator extension when prompted.

5. Set up a remote Llama Stack distribution. You'll need a [Fireworks](https://fireworks.ai/account/api-keys) or [Together](https://api.together.ai/) API key, which you can get quickly via the links:

```
conda create -n llama-stack python=3.10
conda activate llama-stack
pip install llama-stack==0.1.0
```
Then, either:
```
llama stack build --template fireworks --image-type conda
export FIREWORKS_API_KEY="<your_fireworks_api_key>"
llama stack run fireworks
```
or
```
llama stack build --template together --image-type conda
export TOGETHER_API_KEY="<your_together_api_key>"
llama stack run together
```

The default port for `llama stack run` is 5000; you can specify a different one by appending `--port <your_port>` to `llama stack run fireworks|together`.
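For example, applying the note above, running the Fireworks distribution on a non-default port would look like this (5050 is an arbitrary choice):

```shell
# Start the Fireworks distribution on port 5050 instead of the default 5000.
llama stack run fireworks --port 5050
```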

6. Replace the `RemoteInference` URL below with your host IP and port:

```swift
import LlamaStackClient

let inference = RemoteInference(url: URL(string: "http://127.0.0.1:5000")!)

do {
  let userInput = "Hello Llama!" // example prompt
  for await chunk in try await inference.chatCompletion(
    request:
      Components.Schemas.ChatCompletionRequest(
        messages: [
          .UserMessage(Components.Schemas.UserMessage(
            content: .case1(userInput),
            role: .user)
          )
        ], model_id: "meta-llama/Llama-3.1-8B-Instruct",
        stream: true)
  ) {
    switch (chunk.event.delta) {
    case .TextDelta(let s):
      print(s.text)
    case .ImageDelta(let s):
      print("> \(s)")
    case .ToolCallDelta(let s):
      print("> \(s)")
    }
}
}
catch {
print("Error: \(error)")
}
```
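For app code it is often handier to gather the streamed text deltas into a single string rather than printing them. A minimal sketch building on the snippet above; `collectResponse` is our own helper name, not an SDK API:

```swift
import Foundation
import LlamaStackClient

// Hypothetical helper: runs one streaming chat completion and
// concatenates the text deltas into a single response string.
func collectResponse(inference: RemoteInference, prompt: String) async throws -> String {
  var output = ""
  for await chunk in try await inference.chatCompletion(
    request:
      Components.Schemas.ChatCompletionRequest(
        messages: [
          .UserMessage(Components.Schemas.UserMessage(
            content: .case1(prompt),
            role: .user)
          )
        ], model_id: "meta-llama/Llama-3.1-8B-Instruct",
        stream: true)
  ) {
    // Ignore image and tool-call deltas here; a real app may handle them too.
    if case .TextDelta(let s) = chunk.event.delta {
      output += s.text
    }
  }
  return output
}
```

The returned string can then be assigned to a SwiftUI `@State` or published property on the main actor.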

## Contributing

### Syncing the API spec

The Llama Stack `Types.swift` file is generated from the Llama Stack [API spec](https://github.com/meta-llama/llama-stack/blob/main/docs/resources/llama-stack-spec.yaml) in the main [Llama Stack repo](https://github.com/meta-llama/llama-stack). That spec is synced to this repo via a git submodule and script. You shouldn't need to run this unless the API spec and your remote server have been updated.

```
git submodule update --init --recursive
scripts/generate_swift_types.sh
```

This will update the `openapi.yaml` file in the Llama Stack Swift SDK source folder `Sources/LlamaStackClient`.

4 changes: 2 additions & 2 deletions Sources/LlamaStackClient/Agents/Agents.swift
@@ -5,7 +5,7 @@ import OpenAPIURLSession
public protocol Agents {
func create(request: Components.Schemas.CreateAgentRequest) async throws -> Components.Schemas.AgentCreateResponse

func createSession(agent_id: String, request: Components.Schemas.CreateAgentSessionRequest) async throws -> Components.Schemas.AgentSessionCreateResponse

func createTurn(agent_id: String, session_id: String, request: Components.Schemas.CreateAgentTurnRequest) async throws -> AsyncStream<Components.Schemas.AgentTurnResponseStreamChunk>
}
100 changes: 27 additions & 73 deletions Sources/LlamaStackClient/Agents/ChatAgent.swift
@@ -31,10 +31,10 @@ class ChatAgent {
return session
}

public func createAndExecuteTurn(agent_id: String, session_id: String, request: Components.Schemas.CreateAgentTurnRequest) async throws -> AsyncStream<Components.Schemas.AgentTurnResponseStreamChunk> {
return AsyncStream<Components.Schemas.AgentTurnResponseStreamChunk> { continuation in
Task {
let session = sessions[session_id]
let turnId = UUID().uuidString
let startTime = Date()

@@ -46,78 +46,15 @@
))
))
)

// TODO: Build out step history
let steps: [Components.Schemas.Turn.stepsPayloadPayload] = []
var outputMessage: Components.Schemas.CompletionMessage? = nil

for await chunk in self.run(
session: session!,
turnId: turnId,
inputMessages: request.messages.map { $0.toChatCompletionRequest() },
attachments: request.attachments ?? [],
samplingParams: agentConfig.sampling_params
) {
let payload = chunk.event.payload
switch (payload) {
case .AgentTurnResponseStepStartPayload(_):
break
case .AgentTurnResponseStepProgressPayload(_):
break
case .AgentTurnResponseStepCompletePayload(let step):
switch (step.step_details) {
case .InferenceStep(let step):
outputMessage = step.model_response
case .ToolExecutionStep(_):
break
case .ShieldCallStep(_):
break
case .MemoryRetrievalStep(_):
break
}
case .AgentTurnResponseTurnStartPayload(_):
break
case .AgentTurnResponseTurnCompletePayload(_):
break
}

continuation.yield(chunk)
}

let turn = Components.Schemas.Turn(
input_messages: request.messages.map { $0.toAgenticSystemTurnCreateRequest() },
output_attachments: [],
output_message: outputMessage!,
session_id: request.session_id,
started_at: Date(),
steps: steps,
turn_id: turnId
)

await MainActor.run {
var s = self.sessions[request.session_id]
s!.turns.append(turn)
}

continuation.yield(
Components.Schemas.AgentTurnResponseStreamChunk(
event: Components.Schemas.AgentTurnResponseEvent(
payload:
.AgentTurnResponseTurnCompletePayload(Components.Schemas.AgentTurnResponseTurnCompletePayload(
event_type: .turn_complete,
turn: turn))
)
)
)
}
}
}

public func run(
session: Components.Schemas.Session,
turnId: String,
inputMessages: [Components.Schemas.Message],
attachments: [Components.Schemas.Turn.output_attachmentsPayload],
samplingParams: Components.Schemas.SamplingParams?,
stream: Bool = false
) -> AsyncStream<Components.Schemas.AgentTurnResponseStreamChunk> {
@@ -129,44 +66,61 @@
messages: inputMessages,
model_id: agentConfig.model,
stream: true,
tools: [] //agentConfig.client_tools
)
) {
switch(chunk.event.delta) {
case .TextDelta(let s):
continuation.yield(
Components.Schemas.AgentTurnResponseStreamChunk(
event: Components.Schemas.AgentTurnResponseEvent(
payload:
.AgentTurnResponseStepProgressPayload(
Components.Schemas.AgentTurnResponseStepProgressPayload(
delta: .TextDelta(s),
event_type: .step_progress,
step_id: UUID().uuidString,
step_type: .inference
)
)
)
)
)
case .ImageDelta(let s):
continuation.yield(
Components.Schemas.AgentTurnResponseStreamChunk(
event: Components.Schemas.AgentTurnResponseEvent(
payload:
.AgentTurnResponseStepProgressPayload(
Components.Schemas.AgentTurnResponseStepProgressPayload(
delta: .ImageDelta(s),
event_type: .step_progress,
step_id: UUID().uuidString,
step_type: .inference
)
)
)
)
)
case .ToolCallDelta(let s):
continuation.yield(
Components.Schemas.AgentTurnResponseStreamChunk(
event: Components.Schemas.AgentTurnResponseEvent(
payload:
.AgentTurnResponseStepProgressPayload(
Components.Schemas.AgentTurnResponseStepProgressPayload(
delta: .ToolCallDelta(s),
event_type: .step_progress,
step_id: UUID().uuidString,
step_type: .inference
)
)
)
)
)
}
}
continuation.finish()
} catch {
print("Error occurred: \(error)")
}
37 changes: 32 additions & 5 deletions Sources/LlamaStackClient/Agents/CustomTools.swift
@@ -3,11 +3,11 @@ import OpenAPIRuntime

public class CustomTools {

// for chat completion (inference) tool calling
public class func getCreateEventTool() -> Components.Schemas.ToolDefinition {
return Components.Schemas.ToolDefinition(
description: "Create a calendar event",
parameters: Components.Schemas.ToolDefinition.parametersPayload(
additionalProperties: [
"event_name": Components.Schemas.ToolParamDefinition(
description: "The name of the meeting",
@@ -26,7 +26,34 @@
),
]
),
tool_name: Components.Schemas.ToolDefinition.tool_namePayload.case2("create_event")

)
}

// for agent tool calling
public class func getCreateEventToolForAgent() -> Components.Schemas.ToolDef {
return Components.Schemas.ToolDef(
description: "Create a calendar event",
metadata: nil,
name: "create_event",
parameters: [
Components.Schemas.ToolParameter(
description: "The name of the meeting",
name: "event_name",
parameter_type: "string",
required: true),
Components.Schemas.ToolParameter(
description: "Start date in yyyy-MM-dd HH:mm format, eg. '2024-01-01 13:00'",
name: "start",
parameter_type: "string",
required: true),
Components.Schemas.ToolParameter(
description: "End date in yyyy-MM-dd HH:mm format, eg. '2024-01-01 14:00'",
name: "end",
parameter_type: "string",
required: true)
]
)
}
}
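The inference-side definition above can be advertised to the model through the `tools` parameter of a chat completion request, mirroring the `tools:` argument used in `ChatAgent.run`. A hedged sketch; the parameter order follows the generated initializer as used elsewhere in this diff:

```swift
import LlamaStackClient

// Sketch: ask the model to plan a calendar event while advertising
// the create_event tool defined above.
let request = Components.Schemas.ChatCompletionRequest(
  messages: [
    .UserMessage(Components.Schemas.UserMessage(
      content: .case1("Schedule a standup tomorrow from 09:00 to 09:15"),
      role: .user)
    )
  ],
  model_id: "meta-llama/Llama-3.1-8B-Instruct",
  stream: true,
  tools: [CustomTools.getCreateEventTool()]
)
```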
30 changes: 14 additions & 16 deletions Sources/LlamaStackClient/Agents/LocalAgents.swift
@@ -22,30 +22,28 @@
instructions: "You are a helpful assistant",
max_infer_iters: 1,
model: "Meta-Llama3.1-8B-Instruct",
output_shields: []
// tools: [
// Components.Schemas.AgentConfig.toolsPayloadPayload.FunctionCallToolDefinition(
// CustomTools.getCreateEventTool()
// )
// ]
)
)
)
let agentId = createSystemResponse.agent_id

let createSessionResponse = try await createSession(agent_id: agentId,
request: Components.Schemas.CreateAgentSessionRequest(session_name: "pocket-llama")
)
let agenticSystemSessionId = createSessionResponse.session_id

let request = Components.Schemas.CreateAgentTurnRequest(
messages: messages,
stream: true
)

return try await createTurn(agent_id: agentId, session_id: agenticSystemSessionId, request: request)
}

public func create(request: Components.Schemas.CreateAgentRequest) async throws -> Components.Schemas.AgentCreateResponse {
@@ -60,16 +58,16 @@
return Components.Schemas.AgentCreateResponse(agent_id: agentId)
}

public func createSession(agent_id: String, request: Components.Schemas.CreateAgentSessionRequest) async throws -> Components.Schemas.AgentSessionCreateResponse {
let agent = agents[agent_id]
let session = agent!.createSession(name: request.session_name)
return Components.Schemas.AgentSessionCreateResponse(
session_id: session.session_id
)
}

public func createTurn(agent_id: String, session_id: String, request: Components.Schemas.CreateAgentTurnRequest) async throws -> AsyncStream<Components.Schemas.AgentTurnResponseStreamChunk> {
let agent = agents[agent_id]!
return try await agent.createAndExecuteTurn(agent_id: agent_id, session_id: session_id, request: request)
}
}