brianpeiris

joined 2 years ago
 

Hey everyone,

It’s been a while since our last update back in January 2025, and a lot has happened since then.

At that time, we were working on our pilot at St. James Park. I’m really happy to share that everyone from that site was successfully moved into permanent housing — Tara, Brent, Mike, and Francis all got apartments. That’s exactly why we do this.

Since then, we’ve been working on building a stronger relationship with the City so we can get more tiny homes out there and help more people.

This past winter, we ran another pilot in Leslieville. We placed four tiny homes behind Lazarus House, and they’ve been doing a great job supporting the units. It’s been a really positive addition to the neighborhood.

We’ve also built a few homes out at Seeds of Hope Farm, which is another step forward for us as we move more into transitional housing.

Right now, we’re working on a couple of bigger projects that we’re excited about, and we’ll be sharing more on those soon.

Also, a big milestone: we are now a registered charity, which means we can officially provide tax receipts. Charity Number: 715703351RR0001. If you’d like to donate and receive a tax receipt, just reach out to us directly.

Thank you for sticking with us and supporting what we’re building. It means a lot.

— Ryan

https://www.gofundme.com/f/tiny-tiny-homes-affordable-housing-solutions#updates

https://tinytinyhomes.ca/

[–] brianpeiris@lemmy.ca 8 points 1 day ago* (last edited 1 day ago) (1 children)

So this whole story exists because the bigots came out of the woodwork to squee at the acronym? Good then, let them show themselves and burn in the sunlight. The acronym did its job.

[–] brianpeiris@lemmy.ca 1 points 1 day ago

I mean, no one is forcing you to memorize it.

 

Archive link: https://archive.is/MtWjq

While using ChatGPT last June, Van Rootselaar described scenarios involving gun violence over the course of several days, according to people familiar with the matter.
Her posts, flagged by an automated review system, alarmed employees at OpenAI. Internally, about a dozen staffers debated whether to take action on Van Rootselaar’s posts. Some employees interpreted Van Rootselaar’s writings as an indication of potential real-world violence, and urged leaders to alert Canadian law enforcement about her behavior, the people familiar with the matter said.

[–] brianpeiris@lemmy.ca 7 points 2 days ago

I'm very surprised they're even doing this.

[–] brianpeiris@lemmy.ca 0 points 2 days ago

So you don't want to hold OpenAI and Sam Altman accountable. Got it, thanks.

[–] brianpeiris@lemmy.ca 5 points 2 days ago

I understand you're trying to consider both sides of this for the sake of argument, but the issue I have with it is that it is justifying current real world harm in the name of hypothetical (arguably unlikely) future benefit.

[–] brianpeiris@lemmy.ca 8 points 2 days ago (5 children)

I want OpenAI to be held accountable, don't you?

 

Archive link: https://archive.is/MtWjq

OpenAI should be held accountable for this.

While using ChatGPT last June, Van Rootselaar described scenarios involving gun violence over the course of several days, according to people familiar with the matter.
Her posts, flagged by an automated review system, alarmed employees at OpenAI. Internally, about a dozen staffers debated whether to take action on Van Rootselaar’s posts. Some employees interpreted Van Rootselaar’s writings as an indication of potential real-world violence, and urged leaders to alert Canadian law enforcement about her behavior, the people familiar with the matter said.

 

Florida officials are opening an investigation into OpenAI and ChatGPT, its popular chatbot product, in part concerning its alleged assistance in helping plan a mass shooting at Florida State University last year. James Uthmeier, the state's attorney general, announced the probe Thursday morning in a video statement on X. "We've also learned that ChatGPT may likely have been used to assist the murderer in the recent mass school shooting at Florida State University that tragically took two lives," Uthmeier said.

[–] brianpeiris@lemmy.ca 1 points 2 days ago

Maybe voice over, narration, stock photography, copy writing.

[–] brianpeiris@lemmy.ca 16 points 2 days ago

I'm not so sure that power usage should be dismissed so easily just because it is distributed instead of centralized. The slop per watt rate may even be worse than at a datacenter. Fundamentally, we should care more about efficiency.

Imagine a panel of 20 standard LED light bulbs. That's 180 watts, roughly the equivalent of GPU usage while a local LLM is doing any work. If you keep that in mind, then you have to ask yourself if the benefit you're getting out of your local LLM is really worth that energy cost. Now, monetarily speaking, that's not a ton of money, because electricity is cheap, but would you flip that switch for the duration of the task you're performing? What if you could use conventional non-LLM methods to do it instead? Would that be more efficient? And where is your electricity coming from? Is it a solar farm, or a coal plant?
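A back-of-the-envelope sketch of the comparison above. The 180 W figure comes from the comment (20 LED bulbs at roughly 9 W each); the 10-minute task length and $0.15/kWh electricity price are my own illustrative assumptions, not figures from the comment.

```python
# Rough energy-cost estimate for running a local LLM on a GPU.
# Assumptions (not from the comment): 10-minute task, $0.15 per kWh.

GPU_WATTS = 180        # ~20 LED bulbs at ~9 W each, as in the comment
LED_BULB_WATTS = 9
PRICE_PER_KWH = 0.15   # illustrative electricity price, USD

def task_energy_kwh(watts: float, minutes: float) -> float:
    """Energy consumed by a load of `watts` running for `minutes`."""
    return watts * (minutes / 60) / 1000

energy = task_energy_kwh(GPU_WATTS, 10)        # 0.03 kWh for a 10-minute task
cost = energy * PRICE_PER_KWH                  # under half a cent
equivalent_bulbs = GPU_WATTS / LED_BULB_WATTS  # the "panel of 20 bulbs"
```

As the comment says, the money is trivial; the point of the exercise is visualizing the draw (a wall of 20 lit bulbs) and asking whether the task warrants it.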

How was your local LLM trained? Was there copyrighted material in its training data set? Were low-wage workers asked to sift through horrendous content to clean up the data?

We need to consider the externalities, even when using local LLMs. We moved so quickly from the initial release of ChatGPT to now that we never stopped to ask those questions. They remain unanswered until someone cares enough to think.

[–] brianpeiris@lemmy.ca 9 points 2 days ago (2 children)

Local slop is still slop

[–] brianpeiris@lemmy.ca 25 points 2 days ago

They started the meeting with a prayer about "keeping their minds on the children", followed by the most robotic sounding pledge of allegiance I've ever heard, and then they proceeded to pardon the predator, mostly using religious "grace" as a justification. America!

[–] brianpeiris@lemmy.ca 33 points 2 days ago

Buddhist Copilot builds apps with sublime coding standards, and on the last iteration it runs rm -rf * .git before it recites a koan on impermanence.

[–] brianpeiris@lemmy.ca 15 points 3 days ago (1 children)

I hope the rest of Canada isn't just annoyed at having to hear about Doug all the time. This might just be more of an Ontario (and Toronto) problem right now, but it shouldn't be ignored just because it is contained in that province.

Speaking as an ethnic Sri Lankan (but Canadian national), Doug Ford and his ilk remind me very much of the political dynasty that Sri Lanka fell victim to. Sri Lanka saw rampant corruption for decades, followed by a devastating economic crisis and government overthrow. It will take them years to recover, and the political family responsible might still get away with it.

Keep an eye on the Ford family. Aside from the Rob Ford dumpster fire, his nephew Michael Ford was previously the Ontario minister of citizenship and multiculturalism, briefly attempted to run in the Toronto mayoral race, and is now a registered lobbyist at Toronto City Hall.

I think Doug is a bit smarter though, and his political cronies and business pals are probably going to benefit much more than his family, but that doesn't make it any better for the people of Ontario or Canada.


Archive link: https://archive.is/zXQRP (Yes, I'm aware of the problems with archive.is, but I think it's important to let people bypass the paywall in this case)

 

The Canadian Space Agency has also been posting short recap logbooks on their website: https://www.asc-csa.gc.ca/eng/missions/artemis-ii/daily-logbook.asp

 

The ARC Prize organization designs benchmarks specifically crafted around tasks that humans complete easily but that are difficult for AIs such as LLMs, "reasoning" models, and agentic frameworks.

ARC-AGI-3 is the first fully interactive benchmark in the ARC-AGI series. ARC-AGI-3 represents hundreds of original turn-based environments, each handcrafted by a team of human game designers. There are no instructions, no rules, and no stated goals. To succeed, an AI agent must explore each environment on its own, figure out how it works, discover what winning looks like, and carry what it learns forward across increasingly difficult levels.

Previous ARC-AGI benchmarks predicted and tracked major AI breakthroughs, from reasoning models to coding agents. ARC-AGI-3 points to what's next: the gap between AI that can follow instructions and AI that can genuinely explore, learn, and adapt in unfamiliar situations.

You can try the tasks yourself here: https://arcprize.org/arc-agi/3

Here is the current leaderboard for ARC-AGI-3, using state-of-the-art models:

  • OpenAI GPT-5.4 High - 0.3% success rate at $5.2K
  • Google Gemini 3.1 Pro - 0.2% success rate at $2.2K
  • Anthropic Opus 4.6 Max - 0.2% success rate at $8.9K
  • xAI Grok 4.20 Reasoning - 0.0% success rate at $3.8K

ARC-AGI 3 Leaderboard
(Logarithmic cost on the horizontal axis. Note that the vertical scale goes from 0% to 3% in this graph. If human scores were included, they would be at 100%, at the cost of approximately $250.)

https://arcprize.org/leaderboard

Technical report: https://arcprize.org/media/ARC_AGI_3_Technical_Report.pdf

In order for an environment to be included in ARC-AGI-3, it needs to pass the minimum “easy for humans” threshold. Each environment was attempted by 10 people. Only environments that could be fully solved by at least two human participants (independently) were considered for inclusion in the public, semi-private, and fully-private sets. Many environments were solved by six or more people. As a reminder, an environment is considered solved only if the test taker was able to complete all levels upon seeing the environment for the very first time. As such, all ARC-AGI-3 environments are verified to be 100% solvable by humans with no prior task-specific training.

 

If you've never tried Nebula before, here are some one-week guest passes you can use. Nebula is an ad-free streaming platform that focuses on quality, with a curated set of creators, and original content. It's a decent alternative to YouTube, and although their list of creators is still small, some of your favourite YouTubers might already be there.

One per person, please:

https://nebula.tv/redeem/?redemption_code=88395e26-483b-4760-98a4-2d9ebb49fb07

https://nebula.tv/redeem/?redemption_code=7428e2a2-ea90-4909-b8f9-a3448de35172

https://nebula.tv/redeem/?redemption_code=880a57ec-8a2e-4c41-b440-316d41cae8c2

I get three of these every month, so I'll post them here when they renew, unless a mod tells me not to. Just trying to help people find alternatives to YouTube (and Google).

 

Mother and child held in notorious Rio Grande Valley detention centre despite presenting visa, family says

A Canadian mother and her seven-year-old daughter, who has autism, have been detained by US Immigration and Customs Enforcement (ICE) in Texas since Saturday, family members have said.

Relatives of Tania Warner and her daughter Ayla Lucas say they were detained unlawfully. They are uncertain about what problem ICE found with their immigration paperwork.

Tania Warner and her daughter are both Canadians, with Warner originally from British Columbia. The Canadian broadcaster CTV News reported that they are being held at the notorious Rio Grande Valley Central processing centre in McAllen, Texas.

Warner, who is said to have moved to the US five years ago, lives in Kingsville, Texas, with her husband, Edward Warner, a US citizen.

 

Excerpt:

“The micro-modular shelter is working,” Morgan told CTV News. “People are finding indoor spaces. Certainly, there are still people outdoors, but [there’s] a big decline in the numbers of both encampments and people living unsheltered through the winter.”

Last fall, London City Council approved $7 million to construct and operate the 60-unit community (50 single-occupancy and 10 double-occupancy) that will house up to 70 people until April 2027.

The municipality’s Coordinated Informed Response (CIR) Team, which offers support to the unhoused, enforces encampment policies, and responds to the concerns of businesses, has witnessed the transformation of several people who moved into the MMS.

“An incredible change, we visibly see it in folks,” said Debbie Kramers, CIR manager. “We’re now visiting the MMS, going there regularly, and the conversation has changed. It’s about their future and it’s about housing. They’re actually having conversations with my team about what [life] looks like next.”

 

Excerpt:

"Even within the coding, it's not working well," said Smiley. "I'll give you an example. Code can look right and pass the unit tests and still be wrong. The way you measure that is typically in benchmark tests. So a lot of these companies haven't engaged in a proper feedback loop to see what the impact of AI coding is on the outcomes they care about. Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence."

Measures of engineering excellence, said Smiley, include metrics like deployment frequency, lead time to production, change failure rate, mean time to restore, and incident severity. And we need a new set of metrics, he insists, to measure how AI affects engineering performance.

"We don't know what those are yet," he said.

One metric that might be helpful, he said, is measuring tokens burned to get to an approved pull request – a formally accepted change in software. That's the kind of thing that needs to be assessed to determine whether AI helps an organization's engineering practice.

To underscore the consequences of not having that kind of data, Smiley pointed to a recent attempt to rewrite SQLite in Rust using AI.

"It passed all the unit tests, the shape of the code looks right," he said. "It's 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It's a dumpster fire. Throw it away. All that money you spent on it is worthless."

All the optimism about using AI for coding, Smiley argues, comes from measuring the wrong things.

"Coding works if you measure lines of code and pull requests," he said. "Coding does not work if you measure quality and team performance. There's no evidence to suggest that that's moving in a positive direction."

 

Selected developer quotes:

“I’m torn. I’d like to help provide updated data on this question but also I really like using AI!” — a developer from the original early-2025 study, when asked to participate in the late-2025 study.

“I found I am actually heavily biased sampling the issues … I avoid issues like AI can finish things in just 2 hours, but I have to spend 20 hours. I will feel so painful if the task is decided as AI-disallowed.” — a developer from the new study noting selection effects when choosing what tasks to include in the study.

“my head’s going to explode if I try to do too much the old fashioned way because it’s like trying to get across the city walking when all of a sudden I was more used to taking an Uber.” — a developer from the new study.
