
2023-07-29

Using Vault with Go

So today we're gonna use Vault to keep an application's configuration in memory. This makes debugging harder (since the config is in memory, not on disk), but it's a bit more secure: an attacker who gets in has to read process memory to learn the credentials.

The flow of doing this is something like this:

1. Set up the Vault service in a separate directory (vault-server/Dockerfile):

FROM hashicorp/vault

RUN apk add --no-cache bash jq

COPY reseller1-policy.hcl /vault/config/reseller1-policy.hcl
COPY terraform-policy.hcl /vault/config/terraform-policy.hcl
COPY init_vault.sh /init_vault.sh

EXPOSE 8200

ENTRYPOINT [ "/init_vault.sh" ]

HEALTHCHECK \
    --start-period=5s \
    --interval=1s \
    --timeout=1s \
    --retries=30 \
        CMD [ "/bin/sh", "-c", "[ -f /tmp/healthy ]" ]

2. The reseller1 policy ("user" for the app) and the terraform policy (just a name, we don't use Terraform here; this could be any tool that provisions/deploys the app, eg. any CD pipeline) look something like this:

# terraform-policy.hcl
path "auth/approle/role/dummy_role/secret-id" {
  capabilities = ["update"]
}

path "secret/data/dummy_config_yaml/*" {
  capabilities = ["create","update","read","patch","delete"]
}

path "secret/dummy_config_yaml/*" { # v1
  capabilities = ["create","update","read","patch","delete"]
}

path "secret/metadata/dummy_config_yaml/*" {
  capabilities = ["list"]
}

# reseller1-policy.hcl
path "secret/data/dummy_config_yaml/reseller1/*" {
  capabilities = ["read"]
}

path "secret/dummy_config_yaml/reseller1/*" { # v1
  capabilities = ["read"]
}

3. Then we need to create an init script for the Docker image (init_vault.sh) that sets everything up when the container starts (writes the policies, creates the AppRole, creates the token for the provisioner), something like this:

#!/bin/bash
set -e

# start the Vault dev server in the background
# (our custom ENTRYPOINT replaces the image's own entrypoint, so we must start the server ourselves)
vault server -dev \
    -dev-listen-address=0.0.0.0:8200 \
    -dev-root-token-id="${VAULT_DEV_ROOT_TOKEN_ID}" &

export VAULT_ADDR='http://127.0.0.1:8200'
export VAULT_FORMAT='json'
sleep 1s
vault login -no-print "${VAULT_DEV_ROOT_TOKEN_ID}"
vault policy write terraform-policy /vault/config/terraform-policy.hcl
vault policy write reseller1-policy /vault/config/reseller1-policy.hcl
vault auth enable approle

# configure AppRole
vault write auth/approle/role/dummy_role \
    token_policies=reseller1-policy \
    token_num_uses=0 \
    secret_id_ttl="32d" \
    token_ttl="32d" \
    token_max_ttl="32d"

# overwrite token for provisioner
vault token create \
    -id="${TERRAFORM_TOKEN}" \
    -policy=terraform-policy \
    -ttl="32d"

# mark the container healthy for the Dockerfile HEALTHCHECK
touch /tmp/healthy

# keep container alive
tail -f /dev/null & trap 'kill %1' TERM ; wait

4. Now that everything is set up, we can create a docker compose file (docker-compose.yaml) to start it all with proper environment variable injection, something like this:

version: '3.3'
services:
  testvaultserver1:
    build: ./vault-server/
    cap_add:
      - IPC_LOCK
    environment:
      VAULT_DEV_ROOT_TOKEN_ID: root
      APPROLE_ROLE_ID:         dummy_app
      TERRAFORM_TOKEN:         dummyTerraformToken
    ports:
      - "8200:8200"

# run with: docker compose up 

5. Now that the Vault server is up, we can run a script (this should be run by the provisioner/CD) to retrieve an AppRole secret-id and write it to /tmp/secret, and to write our app configuration (config.yaml) to the Vault path with key dummy_config_yaml/reseller1/region99, something like this:

TERRAFORM_TOKEN=`cat docker-compose.yaml | grep TERRAFORM_TOKEN | cut -d':' -f2 | xargs echo -n`
VAULT_ADDRESS="127.0.0.1:8200"

# retrieve an AppRole secret-id so the dummy app can load it from /tmp/secret
curl \
   --request POST \
   --header "X-Vault-Token: ${TERRAFORM_TOKEN}" \
      "${VAULT_ADDRESS}/v1/auth/approle/role/dummy_role/secret-id" > /tmp/debug

cat /tmp/debug | jq -r '.data.secret_id' > /tmp/secret

# check appsecret exists
cat /tmp/debug
cat /tmp/secret

VAULT_DOCKER=`docker ps| grep vault | cut -d' ' -f 1`

echo 'put secret'
cat config.yaml | docker exec -i $VAULT_DOCKER vault -v kv put -address=http://127.0.0.1:8200 -mount=secret dummy_config_yaml/reseller1/region99 raw=-

echo 'check secret length'
docker exec -i $VAULT_DOCKER vault -v kv get -address=http://127.0.0.1:8200 -mount=secret dummy_config_yaml/reseller1/region99 | wc -l
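
Alternatively, the provisioner could write the config through Vault's Go SDK instead of the CLI. A minimal sketch, assuming github.com/hashicorp/vault/api and the same paths/token as above (KV v2 expects the payload nested under a "data" key):

package main

import (
	"log"
	"os"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	config := vault.DefaultConfig()
	config.Address = `http://127.0.0.1:8200`
	client, err := vault.NewClient(config)
	if err != nil {
		log.Fatal(err)
	}
	client.SetToken(os.Getenv(`TERRAFORM_TOKEN`))

	raw, err := os.ReadFile(`config.yaml`)
	if err != nil {
		log.Fatal(err)
	}
	// KV v2 writes go through secret/data/...; the payload is nested under "data"
	_, err = client.Logical().Write(`secret/data/dummy_config_yaml/reseller1/region99`,
		map[string]interface{}{`data`: map[string]interface{}{`raw`: string(raw)}})
	if err != nil {
		log.Fatal(err)
	}
}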

6. Next, we just need to create an application that reads the AppRole secret-id (/tmp/secret) and retrieves the application config from the Vault mount secret, key path dummy_config_yaml/reseller1/region99, something like this:

secretID := readFile(`/tmp/secret`)
config := vault.DefaultConfig()
config.Address = address
client, err := vault.NewClient(config)
appRoleAuth, err := approle.NewAppRoleAuth(
    AppRoleID, // injected at compile time = `dummy_app`
    &approle.SecretID{FromString: secretID})
_, err = client.Auth().Login(context.Background(), appRoleAuth)
const configPath = `secret/data/dummy_config_yaml/reseller1/region99`
secret, err := client.Logical().Read(configPath)
data := secret.Data[`data`]
m, ok := data.(map[string]interface{})
raw, ok := m[`raw`]
rawStr, ok := raw.(string)

The content of rawStr read from Vault will be exactly the same as config.yaml.
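
If the app needs typed access, it can parse rawStr straight from memory. A minimal sketch, assuming gopkg.in/yaml.v3 and a hypothetical Config struct (the real field names depend on your config.yaml):

package main

import (
	"fmt"

	"gopkg.in/yaml.v3" // assumption: any YAML library would do
)

// Config is a hypothetical struct mirroring config.yaml's layout.
type Config struct {
	Debug bool   `yaml:"debug"`
	DbURL string `yaml:"dbUrl"`
}

// parseConfig keeps the config purely in memory: rawStr comes from Vault,
// is unmarshaled into a struct, and is never written to disk.
func parseConfig(rawStr string) (*Config, error) {
	cfg := &Config{}
	if err := yaml.Unmarshal([]byte(rawStr), cfg); err != nil {
		return nil, fmt.Errorf(`yaml.Unmarshal: %w`, err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig("debug: true\ndbUrl: postgres://localhost/app")
	fmt.Println(cfg, err)
}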

This way, if a hacker already got into the system/OS/container, they can only learn the secret-id; to learn the AppRoleID and the config.yaml content, they would have to analyze the process memory. Full source code can be found here.

2023-06-14

Simple Websocket Echo Benchmark

Today we're gonna benchmark nodejs/bun+uWebSockets against golang+nbio. The code is here, both taken from their own examples. The benchmark plan: create 10K clients/connections, send a text/binary string, receive the echo back from the server, then sleep for 1s, 100ms, 10ms, or 1ms before the next request. The result is as expected:

go 1.20.5 nbio 1.3.16
rps: 19157.16 avg/max latency = 1.66ms/319.88ms elapsed 10.2s
102 MB 0.6 core usage
rps: 187728.05 avg/max latency = 0.76ms/167.76ms elapsed 10.2s
104 MB 5 core usage
rps: 501232.80 avg/max latency = 12.48ms/395.01ms elapsed 10.1s
rps: 498869.28 avg/max latency = 12.67ms/425.04ms elapsed 10.1s
134 MB 15 core usage

bun 0.6.9
rps: 17420.17 avg/max latency = 5.57ms/257.61ms elapsed 10.1s
48 MB 0.2 core usage
rps: 95992.29 avg/max latency = 29.93ms/242.74ms elapsed 10.4s
rps: 123589.91 avg/max latency = 40.67ms/366.15ms elapsed 10.2s
rps: 123171.42 avg/max latency = 62.74ms/293.29ms elapsed 10.1s
55 MB 1 core usage

node 18.16.0
rps: 18946.51 avg/max latency = 6.64ms/229.28ms elapsed 10.3s
59 MB 0.2 core usage
rps: 97032.08 avg/max latency = 44.06ms/196.41ms elapsed 11.1s
rps: 114449.91 avg/max latency = 72.62ms/295.33ms elapsed 10.3s
rps: 109512.05 avg/max latency = 79.27ms/226.03ms elapsed 10.2s
59 MB 1 core usage


The first line through the fourth line are with 1s, 100ms, 10ms, and 1ms delay before the next request. Since Golang/nbio can utilize multiple cores by default, it can handle ~50 rps per client, while Bun/NodeJS handle 11-12 rps per client. If you found a bug, or want to contribute another language (or a better client), just create a pull request on the GitHub link above.
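
For reference, a client loop similar to the benchmark's could look like this. A minimal sketch using github.com/gorilla/websocket (an assumption: the actual repo may use a different client library), with the client count and delay scaled down:

package main

import (
	"log"
	"sync"
	"time"

	"github.com/gorilla/websocket" // assumption: any ws client library works
)

func main() {
	const clients = 100 // scaled down from the 10K used in the post
	delay := 100 * time.Millisecond
	var wg sync.WaitGroup
	for i := 0; i < clients; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			conn, _, err := websocket.DefaultDialer.Dial(`ws://127.0.0.1:8080/ws`, nil)
			if err != nil {
				log.Println(`dial:`, err)
				return
			}
			defer conn.Close()
			for j := 0; j < 100; j++ {
				start := time.Now()
				// send a message and wait for the echo
				if err := conn.WriteMessage(websocket.TextMessage, []byte(`hello`)); err != nil {
					return
				}
				if _, _, err := conn.ReadMessage(); err != nil {
					return
				}
				_ = time.Since(start) // accumulate this for avg/max latency
				time.Sleep(delay)     // 1s/100ms/10ms/1ms between requests
			}
		}()
	}
	wg.Wait()
}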

2023-04-18

How to use DNS SDK in Golang

So we're gonna try to manipulate DNS records using a Go SDK (not the REST API directly). I went through the first 2 pages of Google search results, and the companies providing an SDK for Go were:

  1. IBM networking-go-sdk - 161.26.0.10 and 161.26.0.11 - timed out resolving their own website
  2. AWS route53 - 169.254.169.253 - timed out resolving their own website
  3. DNSimple dnsimple-go - 162.159.27.4 and 199.247.155.53 - 160-180ms and 70-75ms from SG
  4. Google googleapis - 8.8.8.8 and 8.8.4.4 - 0ms for both from SG
  5. GCore gcore-dns-sdk-go - 199.247.155.53 and 2.56.220.2 - 0ms and 0-171ms (171ms on first hit only, the rest is 0ms) from SG

I've used the Google SDK before for non-DNS stuff; it's a bit too raw and has so many required steps: you have to create a project, enable the API, create a service account, set permissions for that account, download credentials.json, then hit it using their SDK -- not really straightforward. So today we're gonna try G-Core's DNS. Apparently it's very easy: just visit their website and sign up, then Profile > API Tokens > Create Token, and copy the token to some file (for example: a .token file).

This is an example of how you can create a zone, add an A record, and delete everything:

package main

import (
  "context"
  _ "embed"
  "strings"
  "time"

  "github.com/G-Core/gcore-dns-sdk-go"
  "github.com/kokizzu/gotro/L"
)

//go:embed .token
var apiToken string

func main() {
  apiToken = strings.TrimSpace(apiToken)

  // init SDK
  sdk := dnssdk.NewClient(dnssdk.PermanentAPIKeyAuth(apiToken), func(client *dnssdk.Client) {
    client.Debug = true
  })
  ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
  defer cancel()

  const zoneName = `benalu2.dev`

  // create zone
  _, err := sdk.CreateZone(ctx, zoneName)
  if err != nil && !strings.Contains(err.Error(), `already exists`) {
    L.PanicIf(err, `sdk.CreateZone`)
  }

  // get zone
  zoneResp, err := sdk.Zone(ctx, zoneName)
  L.PanicIf(err, `sdk.Zone`)
  L.Describe(zoneResp)

  // add A record
  err = sdk.AddZoneRRSet(ctx,
    zoneName,        // zone
    `www.`+zoneName, // name
    `A`,             // rrtype
    []dnssdk.ResourceRecord{
      {
// https://apidocs.gcore.com/dns#tag/rrsets/operation/CreateRRSet
        Content: []any{
          `194.233.65.174`,
        },
      },
    },
    120, // TTL
  )
  L.PanicIf(err, `AddZoneRRSet`)

  // get A record
  rr, err := sdk.RRSet(ctx, zoneName, `www.`+zoneName, `A`)
  L.PanicIf(err, `sdk.RRSet`)
  L.Describe(rr)

  // delete A record
  err = sdk.DeleteRRSet(ctx, zoneName, `www.`+zoneName, `A`)
  L.PanicIf(err, `sdk.DeleteRRSet`)

  // delete zone
  err = sdk.DeleteZone(ctx, zoneName)
  L.PanicIf(err, `sdk.DeleteZone`)
}

The full source code repo is here. Apparently it's very easy to manipulate DNS records using their SDK; after adding records programmatically, all I need to do is delegate (set the authoritative nameserver) to their NS: ns1.gcorelabs.net and ns2.gcdn.services.

In my case, because I bought the domain name on Google Domains, I just need to change this:

(screenshot: custom name server settings on Google Domains)

Then just wait for it to be delegated properly (until all DNS servers that still cache the old authoritative NS have cleared it), and I guess that's it.

2022-07-28

Techempower Framework Benchmark Round 21

The result for the TechEmpower framework benchmark round 21 is out. As usual, the most important benchmarks are the data-updates and multiple-queries ones:

(chart: data updates ranking)

The top rankers are Rust, JS (cheating), Java, C++, C#, PHP, C, Scala, Kotlin, and Go. For multiple queries:

(chart: multiple queries ranking)

Top rankers are Rust, Java, JS (cheating), Scala, Kotlin, C++, C#, and Go. These benchmarks show how efficient each language's database driver is (usually the biggest bottleneck), and how much overhead each framework adds (including serialization, alloc/GC, async-I/O, etc).

For memory usage and CPU utilization you can check here https://ajdust.github.io/tfbvis/?testrun=Citrine_started2022-06-30_edd8ab2e-018b-4041-92ce-03e5317d35ea&testtype=update&round=false

2022-05-08

How to structure/layer your Golang Project (or whatever language you are using)

I've been maintaining a lot of other people's projects, and I see that most people blindly follow a framework's structure without purpose. So I'm writing this to convince people of how to write a good directory structure or good layering for an application, especially in this case when creating a service.

It would be better to split your project into exactly 3 layers:
  1. presentation (handles only serialization/deserialization and transport)
  2. business-logic (handles pure business logic, DTO goes here)
  3. persistence (handles records and their persistence, DAO goes here)
It could be a package with a bunch of sub-packages, or split per domain, or per user role. For example:

# monolith example

bin/ # should be added to .gitignore
presenter/ # de/serialization, transport, formatter goes here
  grpc.go
  restapi.go
  cli.go
  ...
businesslogic/ # pure business logic and INPUT/OUTPUT=DTO struct
  role1|role2|etc.go
  domain1|domain2|etc.go
  ...
models/ # all DAO/data access model goes here
  users.go
  schedules.go
  posts.go
  ...
deploy/ 
  Dockerfile
  docker-compose.yml
  deploy.sh
pkg/ # all 3rd party helpers/lib/wrapper goes here
  mysql/
  redpanda/
  aerospike/
  minio/

# Vertical-slice example

bin/
domain1/
  presenter1.go
  business1.go
  models1.go
domain2/
  presenter2.go
  business2.go
  models2.go
...
deploy/
pkg/

Also, it's better to inject dependencies on a per-function basis instead of as a whole struct/interface, something like this:

type LoginIn struct {
  Email string
  Password string
  // normally I embed CommonRequest object
}
type LoginOut struct {
  // normally I embed CommonResponse object with properties:
  SessionToken string
  Error string
  Success bool
}
type GuestDeps struct {
  GetUserByEmailPass func(string, string) (*User, error)
  // other injected dependencies, eg. S3 uploader, 3rd party libs
}
func (g *GuestDeps) Login(in *LoginIn) (out LoginOut) {
  // do validation
  // do retrieve from database
  // return proper object
}

So when you need to do testing, all you need is to create a fake, either with counterfeiter (if you inject an interface instead of a function) or manually, then check with autogold:

func TestLogin(t *testing.T) {
  t.Run(`fail_case`, func(t *testing.T) {
    in := LoginIn{}
    deps := GuestDeps{
      GetUserByEmailPass: func(string, string) (*User, error) {
        return nil, errors.New("failed to retrieve from db")
      },
    }
    out := deps.Login(&in)
    want := autogold.Want("fail_case", nil)
    want.Equal(t, out) // update the expectation with: go test -update
  })
}

Then in the main (the real server implementation), you can wire in the real dependencies, something like this:

rUser := userRepo.NewPgConn(conf.PgConnStr)
srv := httpRest.New( ... )
guest := GuestDeps{
  GetUserByEmailPass: rUser.GetUserByEmailPass,
  DoFileUpload:       s3client.DoFileUpload,
  ...
}
srv.Handle[LoginIn, LoginOut](LoginPath, guest.Login)
srv.Listen(conf.HostPort)

Why are we doing it like this? Because most frameworks I've used are either insanely overlayered or have no clear separation between controller and business logic (the "business logic" still handles transport, serialization, etc), and validation happens only on the outermost layer, or sometimes half outside and half inside the business logic (which can make it vulnerable). When creating unit tests in such a codebase, the programmer is tempted to test the whole http/grpc layer instead of the pure business logic. With the layering above, we can also add another kind of serialization/transport layer without modifying the business logic side. Imagine using 1 controller to handle the business logic: how much hassle would it be to switch frameworks for some reason (framework no longer maintained, performance bottleneck on the framework side, the framework doesn't provide proper middleware for some fragile telemetry, need to add another serialization format or protocol, etc)? But with this kind of layering, if we want to add grpc or json-rpc or a command line interface, or switch frameworks, or anything else, it's easy: just add a layer with the proper serialization/transport and call the original business logic.

mermaid link (numbers 2-4 are under our control)

Talk is cheap, show me the code example! gingorm1 or echogorm1 are the minimal examples (you can always change the framework to fiber or the default net/http or any other framework, and the ORM to sqlc, sqlx, or jet). But if you are all alone, don't want to inject the database functions (which is against clean architecture, but is the most sensible way), and want to test directly against the database, you can check this example: fiber1, or sveltefiber (without database). Note that those are just examples; I would not inject the database as a function (see fiber1 for example), I would depend on it directly and use dockertest for managed dependencies, and only use function injection for unmanaged dependencies (3rd party). A more complex example can be found here: street.

This is only from a code maintainer's point of view. There are WORST practices that I found in the past when continuing other people's projects:
  • table-based entity microservices: every table or set of tables has its own microservice. This is overly granular; some should be coupled instead (eg. user, group, roles, permission -- this should be one service instead of 2-3 services)
  • MVC microservices: one microservice for each layer of the MVC, and worse, each on a different repository, eg. API service for X (V), API service for Y (V), webhook for X (C), adapter for third party Z (C/M), service for persistence (M), etc. Services should be separated by domain/capability instead of by MVC layer. Why? Because a feature that in a monolith needs changes to only 1 repository now forces us to modify 2-4 microservices, and to start 2-4 services just to debug something, which doesn't make sense. It might make a bit of sense in a monorepo, but without one it's more pain than benefit.
  • pub-sub with channels/goroutines without persistence: this one is OK only if the request is discardable (all or nothing), but if it's very important (money, for example) you should always persist every state, and it would be better if there's a worker progressing every state to the next state so we don't have to fix things manually.
There are also some GOOD things I found:
  • Start with auth, metrics, logs (show the request id to the user, and useful responses), and proper dependency injection; don't let this come so late that you have to fix everything afterwards
  • Standard go test instead of a custom test framework, because go test has proper tooling in the IDEs
  • Use workers for slow things, which makes sense: don't let the user wait, unless the product guy wants it all sync. Also send failures to Slack or Telegram or something
  • CRON pattern: runs code at specific times; this is good when you need a scheduled task (eg. billing, reminder, etc)
  • Query-job publish task: this pattern decouples the CRON from the time dependency (query the db, get the list of items to be processed, publish to the MQ), so the publish task can be triggered independently (eg. if there's a bug in only 1 item) and regardless of time, and the workers will pick up any work that is late.
  • Separating proxy (transport/serialization) and processor (business-logic/persistence): this has a really good benefit in terms of scalability for small requests (not for uploading/downloading big files). We put a generic reverse proxy in front, push requests to a two-way pub-sub, then return the response. For example, we create a REST proxy, a gRPC proxy, and a json-rpc proxy; all 3 push to NATS, then the worker/processor processes the request and returns the proper response. This works like lambda, so the programmer only needs to focus on building the worker/processor part instead of the generic serialization/transport; all the generic auth, logging, request id, and metrics can be handled by the proxy. See the sketch after this list.
    the chart is something like this:

    generic proxy <--> NATS <--> worker/processor <--> databases/3rd party

    this way we can also scale independently, either as a monolith or as microservices. Service mesh? No! Service bus/star is the way :3
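
A minimal sketch of the worker/processor side of that chart, assuming github.com/nats-io/nats.go and an illustrative subject name (the proxy side would publish requests with nc.Request and forward the reply):

package main

import (
	"log"

	"github.com/nats-io/nats.go" // assumption: any MQ with request-reply works
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}

	// worker/processor: subscribe to a subject the generic proxy publishes to
	_, err = nc.Subscribe(`rpc.login`, func(msg *nats.Msg) {
		// deserialize msg.Data, call the pure business logic (eg. guest.Login),
		// then reply with the serialized response
		_ = msg.Respond([]byte(`{"success":true}`))
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println(`worker listening on rpc.login`)
	select {} // keep the worker alive
}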

2022-02-22

C# vs Go in Simple Benchmark

Today we're gonna rerun two of my few favorite languages through the associative array and comb sort benchmark (compile and run, not just runtime performance, because waiting for compilation during development also matters), like in the past benchmark. For installing DotNet:

wget https://packages.microsoft.com/config/ubuntu/21.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
rm packages-microsoft-prod.deb
sudo apt install apt-transport-https
sudo apt-get update
sudo apt-get install -y dotnet-sdk-6.0 aspnetcore-runtime-6.0

For installing Golang:

sudo add-apt-repository ppa:longsleep/golang-backports
sudo apt-get update
sudo apt install -y golang-1.17

Result (best of 3 runs)

cd assoc; time dotnet run
6009354 6009348 611297
36186112 159701682 23370001

CPU: 14.16s     Real: 14.41s    RAM: 1945904KB

cd assoc; time go run map.go
6009354 6009348 611297
36186112 159701682 23370001

CPU: 14.80s     Real: 12.01s    RAM: 2305384KB

This is a bit weird; usually I see Go use less memory but run slower, but in this benchmark it's C# that uses less memory while being a bit slower (14.41s vs 12.01s), possibly because the compilation speed is also included.
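
For context, the assoc benchmark's Go side does roughly this: build string keys, then set and get them in a built-in map. A sketch with illustrative N and key shape (the exact code is in the linked repo):

package main

import (
	"fmt"
	"strconv"
)

func main() {
	const n = 1_000_000 // illustrative; the real benchmark may use a different N
	m := map[string]int{}
	for i := 0; i < n; i++ {
		key := `k` + strconv.Itoa(i) // string conversion + concat
		m[key] += i                  // built-in associative array set
	}
	total := 0
	for i := 0; i < n; i++ {
		total += m[`k`+strconv.Itoa(i)] // associative array get
	}
	fmt.Println(len(m), total)
}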

cd num-assoc; time dotnet run
CPU: 2.21s      Real: 2.19s     RAM: 169208KB

cd num-assoc; time go run comb.go
CPU: 0.46s      Real: 0.44s     RAM: 83100KB

What if we increase the N from 1 million to 10 million?

cd num-assoc; time dotnet run
CPU: 19.25s     Real: 19.16s    RAM: 802296KB

cd num-assoc; time go run comb.go
CPU: 4.60s      Real: 4.67s     RAM: 808940KB
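
And the comb sort being timed, for reference. A sketch of the classic algorithm (the repo's comb.go may differ in details such as the shrink factor):

package main

import "fmt"

// combSort sorts ints in place; the gap shrinks by a factor of ~1.3 each pass.
func combSort(a []int) {
	gap := len(a)
	swapped := true
	for gap > 1 || swapped {
		gap = gap * 10 / 13 // shrink factor ~1.3
		if gap < 1 {
			gap = 1
		}
		swapped = false
		for i := 0; i+gap < len(a); i++ {
			if a[i] > a[i+gap] {
				a[i], a[i+gap] = a[i+gap], a[i] // array swap
				swapped = true
			}
		}
	}
}

func main() {
	a := []int{5, 3, 9, 1, 4}
	combSort(a)
	fmt.Println(a) // [1 3 4 5 9]
}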

If you want to contribute (if I made a mistake when coding the C# or Go version of the algorithm, or if there's a more efficient data structure), just fork and create a pull request, and I will redo the benchmark.

2021-12-31

String Associative Array and CombSort Benchmark 2021 Edition

Last year, we did the string associative benchmark and the lesser string associative benchmark (measuring string concat operations and built-in associative array set and get), and the numeric comb sort benchmark and string comb sort benchmark (measuring basic array random access, string conversion, and array swap for numbers and strings). This year we're using a newer server: 32-core running 64-bit Ubuntu 21.10. This time we skip programming languages that have no deb package (unless the install script is just one line and doesn't ruin system directories) or no direct compile-run command, like the previous one; also, only the best of 3 runs is reported.

$ alias time='/usr/bin/time -f "\nCPU: %Us\tReal: %es\tRAM: %MKB"'

This time, NodeJS failed to complete the assoc benchmark (after waiting for an hour) with 10x more data compared to last year. Here's the spreadsheet and the final result (Real duration and RAM):

Language   | Command Flags             | Version           | Assoc (s) | RAM (KB)  | Num Comb (s) | RAM (KB) | Str Comb (s) | RAM (KB)  | Total (s) | RAM (KB)
Go         | go run                    | go1.17.5          | 12.39     | 2,305,824 | 0.46         | 83,096   | 3.33         | 245,896   | 16.18     | 2,634,816
Java       | java                      | 18-ea+15-Ubuntu-4 | 10.70     | 5,582,308 | 0.96         | 170,240  | 6.93         | 722,152   | 18.59     | 6,474,700
Nim        | nim r -d:release --gc:arc | 1.4.2             | 17.90     | 4,200,212 | 1.21         | 79,444   | 5.45         | 633,816   | 24.56     | 4,913,472
Python     | pypy                      | 7.3.5             | 19.70     | 3,104,440 | 2.17         | 140,124  | 5.73         | 523,264   | 27.60     | 3,767,828
C          | tcc -run                  | 0.9.27            | 22.68     | 2,820,568 | 1.01         | 80,484   | 4.80         | 392,896   | 28.49     | 3,293,948
Julia      | julia                     | 1.6.5             | 23.98     | 3,714,448 | 0.52         | 255,844  | 4.60         | 861,076   | 29.10     | 4,831,368
V          | v -prod run               | 0.2.4 a0a1807     | 20.36     | 8,910,856 | 2.67         | 124,248  | 6.48         | 470,136   | 29.51     | 9,505,240
Lua        | luajit                    | 2.1.0-beta3       | 16.70     | 1,133,876 | 3.27         | 133,504  | 10.80        | 511,468   | 30.77     | 1,778,848
Dart       | dart                      | 2.15.1            | 25.92     | 2,101,152 | 1.40         | 208,056  | 5.75         | 574,852   | 33.07     | 2,884,060
Crystal    | crystal run --release     | 1.2.2             | 13.77     | 2,371,320 | 7.73         | 202,552  | 12.08        | 440,748   | 33.58     | 3,014,620
Nim        | nim r -d:release          | 1.4.2             | 24.92     | 3,864,100 | 1.76         | 79,548   | 7.51         | 1,211,200 | 34.19     | 5,154,848
PHP        | php                       | 8.0.8             | 9.94      | 1,368,808 | 7.74         | 328,344  | 24.84        | 641,448   | 42.52     | 2,338,600
Crystal    | crystal run               | 0.35.1            | 38.85     | 2,372,004 | 9.97         | 179,176  | 22.27        | 441,720   | 71.09     | 2,992,900
V          | v run                     | 0.2.4 a0a1807     | 51.81     | 8,911,004 | 6.63         | 79,716   | 18.43        | 470,420   | 76.87     | 9,461,140
Python     | python3                   | 3.9.7             | 43.11     | 4,106,892 | 29.16        | 405,332  | 43.08        | 722,996   | 115.35    | 5,235,220
Nim        | nim r                     | 1.4.2             | 88.05     | 3,864,048 | 2.93         | 79,536   | 32.60        | 1,211,260 | 123.58    | 5,154,844
Ruby       | ruby                      | 2.7.4p191         | 52.48     | 2,970,908 | 27.15        | 100,320  | 52.23        | 708,940   | 131.86    | 3,780,168
Javascript | node                      | v16.13.1          | 999.99    | 9,999,999 | 0.71         | 115,044  | 6.11         | 461,836   | 1006.81   | 10,576,879

FAQ

1. Why do you measure the compile duration too? Because developer experience is also important (the feedback loop: edit-rebuild/compile-run), at least for me; it would suck a lot if we had to wait a minute to compile before we could test something. We could always write precalculated values with C++ templates to make runtime faster, for example, but the compilation delay would be painful.
2. Why not warm up the VM first? Each implementation has its own advantages and disadvantages. We already know that compiled languages are mostly faster at runtime, but at the cost of a relatively slower development feedback loop, and interpreted languages are mostly slower at runtime, especially if executed by a VM with startup overhead, with the exception of ones with AOT or JIT optimization. So to make it fair for every kind of implementation, we don't glorify runtime performance (which makes a lot of sense for servers or long-lived processes, but is not the best for development or CI cost, which people often neglect), but measure total performance: Compile duration (if any) + VM startup duration (if any) + AOT or JIT duration (if any) + Runtime duration. This way, every strategy a PL's implementors use can be fairly judged.
3. Why is there no C++, VB.NET, C#, D, or Object-Pascal? I don't want to compile things manually (there's no build-and-run command in one flag).
4. Why is there no Kotlin, Scala, Rust, Elixir, Pony, Swift, Groovy, or Zig? Too lazy to add :3 You can contribute though (create a pull request, and I'll run the benchmark again, preferably when there's a precompiled binary/deb/apt/PPA repository for the compiler/interpreter).
5. Why is there no Ruby 3.1? I can't find any PPA for the latest Ruby; the latest one in the Ubuntu 21.10 repo is Ruby 2.7.

Contributors: ilmanzo (Nim, Crystal, D), inkydragon (Julia)