Mechanize

joined 2 years ago
[–] Mechanize@feddit.it 0 points 1 week ago (2 children)

Try this one: nerdvpn instance

Not sure if they block VPNs too, but there are a bunch of other instances that proxy/clean up reddit, you can use something like libredirect to automate the redirection and instance list management.

[–] Mechanize@feddit.it 30 points 2 weeks ago

The TLDR of the paywalled article:

The order stems from fears that the chat group may have become the target of targeted cyberattacks.

The decision was made, according to the report, after the Commission became aware of the group's existence last month and deemed the risk of compromise too high. While there is no evidence yet that communications have actually been intercepted, the threat level has escalated.

The Commission is now reacting with stricter IT guidelines and regular checks of employee hardware.

Recently, Dutch authorities warned of a global campaign in which Russian cybercriminals are using fake Signal support bots to lure users into traps.

I feel the title is really not in line with the article contents: the only thing it says about Signal specifically is that it lacks some security and management features common in state-run infrastructure.

It seems to be more a case of generally tightening the rules around the politicians' communication channels.

[–] Mechanize@feddit.it 1 points 3 months ago* (last edited 3 months ago)

I'm not sure what you are really interested in. If you are looking for frontier-model capabilities with a good privacy policy... the answer is no.

If you are interested in privacy and can take a hit to performance, there's Lumo by Proton, which I've never tried personally, but it should use open models, and the list should be somewhere on their site.

Otherwise you can go European with Mistral's Le Chat, which is not as good as the multibillion-dollar companies' offerings but is quite good. I tend to use this one. Check the settings to disable data training.

Last but not least you can use a wrapper around the frontier models like the one offered by duckduckgo. There are many.

If you don't mind paying, there are no-logs services that give you access to Kimi K2 level models. Or you could spin up something on RunPod or Vast.ai style GPU rentals.

So. It depends.

[–] Mechanize@feddit.it 4 points 4 months ago

We really need tags on Lemmy, so I can easily find and re-read all of these captain's jokes comics, once in a while~

Great job, as usual!

[–] Mechanize@feddit.it 2 points 4 months ago

The big problem is that we have long stepped over that line.

Now even when you pay you are still shown ads (maybe, but not surely, non-targeted) and your data is still scraped and analyzed to hell and back.

[–] Mechanize@feddit.it 4 points 4 months ago

I would personally prefer not to have AI-generated photorealistic content that can easily be mistaken for real photos in this community.

At the same time its very nature makes it really hard to moderate and control, and it can easily spiral into witch hunts or into limiting the content to only a handful of trusted sources, which would kill the already limited contributions outside of the ones from the wonderful anon.

The only solution I can think of is adding a vague rule, as someone already posted, asking to avoid it, and being lenient on the casual transgressors: ask, educate and warn first, rather than straight-out banning or demeaning.

At the same time I feel there should be low tolerance towards the amount of vulgar, harsh and honestly disheartening comments that tend to swamp discussions even vaguely related to AI on Lemmy. There's a really vocal group of people that floods any thread that could potentially be about, or contain, AI.

I get the reaction, but I think it has no place here: this has always been a wholesome, kind community.

About artsy AI (drawings, comics, fake paintings, digital pieces, etc.) I don't really have a strong opinion, mostly because I don't follow this community for the art but for the photos of real owls: I just think it should at least be properly tagged.

[–] Mechanize@feddit.it 30 points 4 months ago* (last edited 4 months ago) (1 children)

You can leak memory in perfectly safe Rust, because a leak is not a memory-safety bug per se; one example is Box::leak.

Preventing memory leaks was never among Rust's goals. What it tries to safeguard you from are memory-safety bugs like the infamous and common double free.
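
A minimal sketch of the point above: leaking an allocation on purpose, with zero `unsafe`.

```rust
fn main() {
    // Box::leak consumes the Box and hands back a &'static mut reference,
    // intentionally leaking the allocation: no `unsafe` anywhere.
    let leaked: &'static mut String = Box::leak(Box::new(String::from("still allocated")));
    leaked.push_str(" forever");
    assert_eq!(leaked, "still allocated forever");
    // The String is never dropped: a leak, but perfectly memory safe.
}
```

Other safe ways to leak include `std::mem::forget` and reference-counting cycles with `Rc`.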

[–] Mechanize@feddit.it 22 points 5 months ago (2 children)

You could consider a physical donation to the Internet Archive too, which is potentially more useful than a normal library because, as they say and I quote, they "try to digitize materials and make them available publicly as funding allows".

More info here: https://help.archive.org/help/how-do-i-make-a-physical-donation-to-the-internet-archive/

[–] Mechanize@feddit.it 0 points 5 months ago (1 children)

I don't have direct experience with RooCode and Cline, but I would be mighty surprised if they worked with models smaller than even the old Qwen2.5-Coder 32B - and even that was mostly misses. I never tried the Qwen3 coder models, but I assume they are not drastically different.

Those small models are at most useful for some kind of smarter autocomplete, not to run a full tools framework.

BTW you could check out Aider too for a different approach, and they have a lot of benchmarks that can help you get an idea about what's needed.

[–] Mechanize@feddit.it 0 points 5 months ago

You have to wait for the annual reruns, or - if you are on PC - you can use a mod (I think it is this one?) to rerun them offline

[–] Mechanize@feddit.it 5 points 5 months ago

I think you mean: certified mail at least 30 days before renewal to cancel, we will answer between 60 to 90 days only if the termination was successful.

[–] Mechanize@feddit.it 0 points 8 months ago* (last edited 8 months ago)

I don't have them: I generated a new one modifying the prompt

Is this what you meant? If you want it in other styles I can try them out, but it will take some time

EDIT: If it was because I said it was straightforward to gen them in whatever style, it is because of the dataset Chroma used for training: I would be incredibly surprised if centaurs aren't in there

Sorry about the confusion

 

Some days ago ROCm 6.4 was officially added to the Arch repositories - which is great - but it made my current setup completely explode - which is less great - and currently I don't have the necessary will to go and come back from gdb hell...

So I've taken this opportunity to set up a podman (docker alternative) container to use the older, and for me working, ROCm 6.3.3. On the plus side this has made it even easier to test new things and do random stuff: I will probably port my Vulkan setup too, at a later date.

Long story short I've decided to clean it up a bit, place a bunch of links and comments, and share it with you all in the hope it will help someone out.

You still need to handle the necessary requirements on your host system to make everything work, but I have complete trust in you! Even if it doesn't work, it is a starting point that I hope will give you some direction on what to do.

BTW I'm not an expert in this field, so some things can be undoubtedly improved.

Assumptions

  • To make this simpler I will assume, and advise you to use, this kind of folder structure:
base_dir
 ├─ROCm_debian_dev
 │  └─ Dockerfile
 └─llamacpp_rocm6.33
    ├─ logs
    │   └─ logfile.log
    ├─ workdir
    │   └─ entrypoint.sh
    ├─ Dockerfile
    └─ compose.yaml
  • I've tested this on Arch Linux. You can probably make it work on basically any current, not-too-old distro, but that's untested.

  • You should follow the basic requirements from the AMD documentation, and cross your fingers. You can probably find a more precise guide on your distro wiki. Or just install any and all ROCm and HIP related SDKs. Sigh.

  • I'm using podman, which is an alternative to docker. It has some idiosyncrasies - which I will not get into because they would require another full write-up, so if you use docker it is possible you'll need to modify some things. I can't help you there.

  • This is given with no warranty: if your computer catches on fire, it is on you (code MIT/Apache 2 license, the one you prefer; text CC BY-SA 4.0). More at the end.

  • You should know what 'generation' of card yours is. ROCm works in mysterious ways and each card has its problems. Generally you can just steamroll forward with no care, but you still need to find which HSA_OVERRIDE_GFX_VERSION your card needs to run under. For example, for a rx6600xt/rx6650xt it would be gfx1030 and HSA_OVERRIDE_GFX_VERSION=10.3.0. Some info here: Compatibility Matrix. You can (not so) easily search for the correct gfx and HSA codes on the web. I don't think the 9xxx series is currently supported, but I could be wrong.

  • There's an official Docker image in the llama.cpp repository; you could give that one a go. Personally I like doing them myself, so I understand what is going on when I inevitably bleed on the edge - in fact I didn't even consider the existence of an official Dockerfile until after writing this post... Welp. Still, they are two different approaches: pick your poison.
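
As a rough sanity check for the gfx/HSA point above, the mapping from a plain numeric gfx target to the override is mostly mechanical. This helper is my own sketch, not part of any ROCm tooling, and targets with hex letters (e.g. gfx90a) still need a manual lookup:

```shell
# Sketch: derive HSA_OVERRIDE_GFX_VERSION from a gfx target name.
# Works for plain numeric targets (gfx1030 -> 10.3.0, gfx906 -> 9.0.6);
# always double check against the compatibility matrix.
gfx_to_hsa() {
  local g="${1#gfx}"
  # last digit is the step, second-to-last the minor, the rest the major
  printf '%s.%s.%s\n' "${g:0:${#g}-2}" "${g: -2:1}" "${g: -1}"
}
gfx_to_hsa gfx1030  # prints 10.3.0
```

Remember that the override you need may belong to a sibling card (gfx1032 cards run under the gfx1030 override, see below).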

Dockerfile(s)

These can, at a high level, be described as the recipes with which we will set up the container that will compile and run llama.cpp for us.

I will put here two Dockerfiles: one can be used as a fixed base, while the second one can be rebuilt every time you want to update llama.cpp.

Now, this will create a new image each time. We could use a volume (like a virtual directory shared between the host machine and the container) to just git pull the new code instead of cloning, but that would almost completely disregard the pros of running this in a container. TL;DR: for now don't overthink it and go with the flow.

Base image

This is a pretty basic recipe: it gets the official dev-ubuntu image by AMD and then augments it to suit our needs. You can easily use other versions of ROCm (for example dev-ubuntu-24.04:6.4-complete) or even of Ubuntu. You can find the filtered list of the images here: Link

Could we use a lighter image? Yes. Should we? Probably. Maybe next time.

tbh I've tried other images with no success, or they needed too much effort for a minimal reward: this Just Works™. YMMV.

base_dir/ROCm_debian_dev/Dockerfile

# This is the one that currently works for me, you can
# select a different one:
#   https://hub.docker.com/r/rocm/dev-ubuntu-24.04/tags
FROM docker.io/rocm/dev-ubuntu-24.04:6.3.3-complete
# 6.4.0
# FROM docker.io/rocm/dev-ubuntu-24.04:6.4-complete

# We update and then install some stuff.
# In theory we could delete more things to make the final
# image slimmer.
RUN apt-get update && apt-get install -y \
    build-essential \
    git \
    cmake \
    libcurl4-openssl-dev \
    && rm -rf /var/lib/apt/lists/*

It is a big image, over 30GB in size once extracted (around 6GB to download for 6.3.3-complete and around 4GB for 6.4-complete).

Let's build it:

cd base_dir/ROCm_debian_dev/
podman build -t rocm-6.3.3_ubuntu-dev:latest .

This will build it and add it to your local images (you can see them with podman images) with the name rocm-6.3.3_ubuntu-dev and the tag latest. You can change them as you see fit, obviously. You can even give multiple tags to the same image, a common way is to have a more specific tag and then add the tag latest to the last one you have generated, so you don't have to change the other scripts that reference it. More info here: podman tag
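
For example, a sketch of the multi-tag idea (the date tag is arbitrary; the image name matches the one used above):

```shell
# Build with a specific tag, then point `latest` at the same image.
podman build -t rocm-6.3.3_ubuntu-dev:2025-06 .
podman tag rocm-6.3.3_ubuntu-dev:2025-06 rocm-6.3.3_ubuntu-dev:latest
# Both tags now show up for the same image ID.
podman images rocm-6.3.3_ubuntu-dev
```
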

The real image

The second image is the one that will handle the compilation and then execution of llama-server and llama-bench, and you need to customize it:

  • You should modify the number after the -j based on the number of virtual cores that your CPU has, minus one. You can probably use nproc in a terminal to check for it.
  • You have to change the AMDGPU_TARGETS code based on your gfx version! Pay attention, because the correct one is probably not the one returned by rocminfo: for example the rx6650xt is gfx1032, but that is not directly supported by ROCm. You have to use the supported (and basically identical) gfx1030 instead.

If you want to compile with a ROCm image after 6.3 you need to swap the commented lines. Still, no idea if it works or if it is even supported by llama.cpp.

More info, and some tips, here: Link

base_dir/llamacpp_rocm6.33/Dockerfile

FROM localhost/rocm-6.3.3_ubuntu-dev:latest

# This could be shortened, but I like to have multiple
# steps to make it clear, and show how to achieve
# things in different ways.
WORKDIR /app
RUN git clone https://github.com/ggml-org/llama.cpp.git
WORKDIR /app/llama.cpp
RUN mkdir build_hip
WORKDIR build_hip
# This will run the cmake configuration.
# Pre  6.4 -DAMDGPU_TARGETS=gfx1030
RUN HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" cmake -S .. -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release
# Post 6.4 -DGPU_TARGETS=gfx1030
# RUN HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" cmake -S .. -DGGML_HIP=ON -DGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release
# Here we build the binaries, both for the server and the bench.
RUN cmake --build . --config Release -j7 --target llama-server
RUN cmake --build . --config Release -j7 --target llama-bench

To build this one we will need to use a different command:

cd base_dir/llamacpp_rocm6.33/
podman build --no-cache -t rocm-6.3.3_llamacpp:b1234 .

As you can see we have added the --no-cache flag: this is to make sure that the image actually gets rebuilt, otherwise podman would just keep outputting the same image over and over from the cache - because the recipe didn't change. This time the tag is a b1234 placeholder; you should use the current release build number or the current commit short hash of llama.cpp (you can easily find them when you start the binary, or by going to the GitHub page) to remember at which point you compiled, and use the dynamic latest tag as a supplementary bookmark. The current date is a good candidate too.
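
One way to get that short hash without entering the container, assuming git and network access on the host (the remote URL is the one the Dockerfile clones):

```shell
# Ask the remote for HEAD and cut the hash down to 7 characters.
TAG=$(git ls-remote https://github.com/ggml-org/llama.cpp.git HEAD | cut -c1-7)
podman build --no-cache -t "rocm-6.3.3_llamacpp:${TAG}" .
podman tag "rocm-6.3.3_llamacpp:${TAG}" rocm-6.3.3_llamacpp:latest
```

Note the hash is fetched right before the build, so it matches what the clone inside the image will get (barring a push in between).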

If something doesn't feel right - for example your GPU is not running when you make a request to the server - you should try to read the configuration step logs, to see that everything required has been correctly set up and there are no errors.

Let's compose it up

Now that we have two images that have compiled without any kind of error, we can use them to reach our goal. I've heavily commented the compose file, so just read it and modify it directly. Don't worry too much about all the lines, but if you are curious - and you should be - you can easily search for them and find a bunch of explanations that are surely better than what I could write here without taking up too much space.

Being a yaml file - bless the soul of whoever decided that - pay attention to the whitespaces! They matter!

We will use two volumes: one will point to the folder where you have downloaded your GGUF files, the second one to where we have the entrypoint.sh file. We are putting the script into a volume instead of baking it into the container so you can easily modify it to experiment.

A small model that you could use as a benchmark to see if everything is working is Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf.

base_dir/llamacpp_rocm6.33/compose.yaml

# Benchmark image: https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/blob/main/Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf
# benchmark command:
#    ./bin/llama-bench -t 7 -m /app/models/Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf -ngl 99 -fa 1 -ctk q4_0 -ctv q4_0
#    ./bin/llama-bench -t 7 -m /app/models/Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf -ngl 99
services:
    llamacpp-server:
        # If you have renamed the image, change it here too!
        image: localhost/rocm-6.3.3_llamacpp:latest
        # The subsequent two lines are needed to enter the image and directly use bash:
        # start it with [podman-compose up -d|docker compose up -d]
        # and then docker attach to the container with
        # [podman|docker] attach ID
        # You'll need to change the entrypoint.sh file too, just with the
        # shebang and a line straight up calling `bash`, as content.
        stdin_open: true
        tty: true
        # end bash section, Comment those two lines if you don't need shell
        # access. Or leave them.
        group_add:
            # The video group is needed on most distros to access the GPU
            # the render group is not present in some and needed
            # in others. Try it out.
            - "video" # 985
            # - "render" # 989
        environment:
            # FIXME: Change this with the right one!
            # If you have a wrong one it will _not work_.
            - HSA_OVERRIDE_GFX_VERSION=10.3.0
        devices:
            - /dev/kfd:/dev/kfd
            - /dev/dri:/dev/dri
        cap_add:
            - SYS_PTRACE
        logging:
            # The default logging driver is journald, which I despise
            # because it can pollute it up pretty hard.
            #
            # The none driver will not save the logs anywhere.
            # You can still attach to the container, but you will lose
            # the lines before the attachment.
            # driver: none
            #
            # The json-file option is deprecated, so we will use the
            # k8s-file one.
            # You can use `podman-compose logs -f` to keep tabs, and it will not
            # pollute the system journal.
            # Remember to `podman-compose down` to stop the container.
            # `ctrl+c`ing the logs will do nothing.
            driver: k8s-file
            options:
                max-size: "10m"
                max-file: "3"
                # You should probably use an absolute path.
                # Really.
                path: ./logs/logfile.log
        # This is mostly a fix for how the podman network stack works.
        # If you are offline when starting the image it would just not
        # start, erroring out. Putting it in host mode solves this,
        # but it has other cons.
        # Reading the issue (https://github.com/containers/podman/issues/21896) it is
        # probably fixed, but I still have to test it out.
        # It mainly means that you can't run multiple of these at once, because
        # they would take the same port. Luckily you can change the port from the
        # llama-server command in the entrypoint.sh script.
        network_mode: "host"
        ipc: host
        security_opt:
            - seccomp:unconfined
        # These you really need to CHANGE.
        volumes:
            # FIXME: Change these paths! Only the left side before the `:`.
            #        Use absolute paths.
            - /path/on/your/machine/where/the/ggufs/are:/app/models
            - /path/to/rocm6.3.3-llamacpp/workdir:/workdir
        # It doesn't work with podman-compose
        # restart: no
        entrypoint: "/workdir/entrypoint.sh"
        # To make it easy to use I've added a number of env variables
        # with which you can set the llama.cpp command params.
        # More info in the bash script, but they are quite self explanatory.
        command:
            - "${MODEL_FILENAME:-Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf}"
            - "${GPU_LAYERS:-22}"
            - "${CONTEXT_SIZE:-8192}"
            - "${CALL_TYPE:-bench}"
            - "${CPU_THREADS:-7}"

Now that you have meticulously modified the above file let's talk about the script that will launch llama.cpp.

base_dir/llamacpp_rocm6.33/workdir/entrypoint.sh

#!/bin/bash
cd /app/llama.cpp/build_hip || exit 1
MODEL_FILENAME=${1:-"Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf"}
GPU_LAYERS=${2:-"22"}
CONTEXT_SIZE=${3:-"8192"}
CALL_TYPE=${4:-"server"}
CPU_THREADS=${5:-"7"}

if [ "$CALL_TYPE" = "bench" ]; then
  ./bin/llama-bench -t "$CPU_THREADS" -m /app/models/"$MODEL_FILENAME" -ngl "$GPU_LAYERS"
elif [ "$CALL_TYPE" = "fa-bench" ]; then
  ./bin/llama-bench -t "$CPU_THREADS" -m /app/models/"$MODEL_FILENAME" -ngl "$GPU_LAYERS" -fa 1 -ctk q4_0 -ctv q4_0
elif [ "$CALL_TYPE" = "server" ]; then
  ./bin/llama-server -t "$CPU_THREADS" -c "$CONTEXT_SIZE" -m /app/models/"$MODEL_FILENAME" -fa -ngl "$GPU_LAYERS" -ctk q4_0 -ctv q4_0
else
  echo "Valid modalities are \"bench\", \"fa-bench\" or \"server\""
  exit 1
fi

exit 0

This is straightforward: it enters the folder (inside the container) where we built the binaries and then calls the right command, chosen by the CALL_TYPE parameter. I've set it up to handle some common options, so you don't have to change the script every time you want to run a different model or change the number of layers loaded into VRAM.

The beauty of it is that you could put a .env file in the llamacpp_rocm6.33 folder with the params you want to use, and just start the container.

An example .env file could be:

base_dir/llamacpp_rocm6.33/.env

MODEL_FILENAME=Meta-Llama-3.1-8B-Instruct-IQ4_XS.gguf
GPU_LAYERS=99
CONTEXT_SIZE=8192
CALL_TYPE=bench
CPU_THREADS=7

Some notes:

  • For now the server uses flash attention by default with a quantized context. You can avoid this by deleting the -fa and the -ctk q4_0 -ctv q4_0. Experiment around.
  • You could add more params or environmental variables: it is easy to do. How about one for the port number?
  • Find more info about llama.cpp server here: Link.
  • And the bench here: Link.
  • For now I've set up three modalities: server, a plain bench, and fa-bench, a bench with FlashAttention enabled.
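
For the port idea above, a minimal sketch. PORT_NUMBER and its position as sixth parameter are my invention, not part of the script above; llama-server does accept --port:

```shell
# In entrypoint.sh, default a hypothetical sixth positional parameter:
PORT_NUMBER=${6:-"8080"}
# ...then append `--port "$PORT_NUMBER"` to the llama-server command, and in
# compose.yaml add a matching line under `command:`:
#     - "${PORT_NUMBER:-8080}"
echo "server would listen on port ${PORT_NUMBER}"
```

The same pattern works for any other llama.cpp flag you want to expose through the .env file.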

Time to start it

Starting it is just a command away:

cd base_dir/llamacpp_rocm6.33/
podman-compose up -d
podman-compose logs -f

When everything is completely loaded, open your browser and go to http://127.0.0.1:8080/ to be welcomed by the llama.cpp webui, and test whether the GPU is being used. (I have my fingers crossed for you!)

Now that everything is working, have fun with your waifus and/or husbandos! ..Sorry, I meant, be productive with your helpful assistant!

When you are done, in the same folder, run podman-compose down to mercilessly kill them off.

Licensing

I know, I know. But better safe than sorry.

All the code, configurations and comments in them not otherwise already under other licenses or under copyright by others, are dual licensed under the MIT and Apache 2 licenses, Copyright 2025 [Mechanize@feddit.it](https://feddit.it/u/Mechanize) . Take your pick.

All the other text of the post © 2025 by Mechanize@feddit.it is licensed under CC BY-SA 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/

 

Last night Organic Maps was removed from the Play Store without any warning or additional details due to "not meeting the requirements for the Family Program". Unlike Google Maps and other maps apps rated for ages 3+, there are no ads or in-app purchases in Organic Maps. We have asked for an appeal.

As a temporary workaround for the Google Play issue, you can install the new upcoming Google Play update from this link: https://cdn.organicmaps.app/apk/OrganicMaps-24081605-GooglePlay.apk

The Announcement on various Networks: Fosstodon Post
Twitter Post
Telegram Post

If you don't know what Organic Maps is: it is an alternative to OsmAnd and Google Maps. More info on the official site (link) and GitHub.

Maybe an error? Honestly this is a weird one. I hope we will learn more in the coming hours.

You can still get it from the other channels, like F-Droid or Obtainium. Still, we all know that not being on the Play Store is a heavy sentence for any Android app.

EDITs

  • Added F-Droid link.
  • Fixed Typo in the obtainium link.