@ub|k in #patchwork

Has anyone else been having trouble with patchwork (3.6.6)? Mine sometimes gets stuck "Downloading new messages" and refuses to load anything else. I see some "connection refused" errors in the console, nothing too suspicious. After a few minutes it seems to work again, but the "Downloading new messages" indicator stays.

@ev
Re: %wv/IenEk8

It sounds to me like the scuttlebot daemon inside of Patchwork is crashing, or not responding to requests some of the time.

Do you get the same error if you clone and build scuttlebot on its own?

@nanomonkey in #patchwork
Re: %wv/IenEk8

Yes, same problem here. If I close Patchwork and open it back up, it will be fixed. For me the problem happens after I click on the yellow new messages button; after that it doesn't render the new messages.

@ub|k in #patchwork
Re: %wv/IenEk8

@ev, not sure what you mean. If I run scuttlebot on the command line there are still the "connection refused" errors. But I think they're unrelated.

@ev
Re: %wv/IenEk8

@ub|k What's the full error you're getting on the command line?

I'm not very experienced with Patchwork, because I don't develop with it. I'm not sure whether it uses your external scuttlebot instance when it starts.

Have you tried using %patchfoo or %minbase as a client?

@ub|k
Re: %wv/IenEk8

@ev, "Error: challenge not accepted"

If I try to start it while running an external scuttlebot, it fails trying to bind its own scuttlebot to port 8008.

@ev
Re: %wv/IenEk8

@ub|k Ok. I'd say try %patchfoo or %minbase if this gets annoying. These clients aren't experiencing the issue you're having.

I think it's a Patchwork-only error you're experiencing, having to do with the flume progress bar. Or is scuttlebot just hanging during replication? We already know about that error.

If you haven't already, file an issue at http://github.com/ssbc/scuttlebot

@Matt in #patchwork
Re: %wv/IenEk8

This is a bug in sbot legacy replication and affects all clients (that aren't using ebt).

See https://github.com/ssbc/patchwork/issues/597 and mega thread where I tried to fix this (the first time): %GbJVNVx...

I used to be able to reproduce this reliably by disconnecting from wifi when connected to peers, but mysteriously this no longer works. Which is nice and all, but makes this very difficult to debug.

What operating system and versions are you running @ub|k @nanomonkey ?

@nanomonkey in #patchwork
Re: %wv/IenEk8

Patchwork 3.6.6, OS X El Capitan (10.11.6)

@John in #patchwork
Re: %wv/IenEk8

I recommend people roll back to 3.6.4 - it's been working more reliably for me.


Heads up: Matt is prepping for international travel. I really want to support him to have good energy for travel by figuring out how to resolve this without leaning on him.
We've also been talking about how it would be good to figure out how to share maintenance work of core apps across more people - it can be a pretty heavy burden.

As part of this, I'm going to meet up with him and download a bunch of known issues (into github issues) before he sets off.

For this issue, what do people think would be a good solution?
I was thinking perhaps we should remove the 3.6.6 and 3.6.5 installer downloads till someone has capacity to look closer. Or we can mark them unstable. What do people think?

@Mikey in #patchwork
Re: %wv/IenEk8

I was thinking perhaps we should remove 3.6.6 and 3.6.5 installer downloads till someone has capacity to look closer. Or we can mark them unstable. What do people think?

which is worse, memory leak or temporary server hang?

@matt has invested a lot of energy into discovering and understanding this specific issue over the last few months. (as far as i understand) the server hang is present in all versions of scuttlebot except for the ones with his temporary patch applied, but that patch leads to a memory leak.

Matt chose to revert the patch, accepting the temporary hang over the memory leak, which i actually agree is a sound decision.

so

  • i'm :-1: on removing the recent versions or marking them unstable
  • i'm :+1: on documenting the trade-offs

@Mikey in #patchwork
Re: %wv/IenEk8

good context from https://github.com/ssbc/patchwork/issues/597#issuecomment-320116067

This is triggered when a remote peer closes the connection without warning, and then your side tries to cleanup the closed connection. It can also be triggered by turning off your wifi.

This bug has existed for at least a year. And affects ALL CLIENTS. But it has only become noticeable now that there are so many FOAFs (friend-of-friends). Also, patchwork used to buffer a long way ahead in the feed (1000) which disguised this, now it only reads 50 messages ahead.

@cellular in #patchwork
Re: %wv/IenEk8

Is there anything that would block deploying ssb-ebt in patchwork?

@Matt in #patchwork
Re: %wv/IenEk8

Currently blocking EBT in patchwork:

  • wider pub support
  • updating patchwork to use the new replication progress api

We have to completely remove legacy replication from patchwork to make this go away.

Something I haven't yet tested: it might be enough to just disable permanent connections to legacy pubs and only persist connections to ebt pubs.
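
A minimal sketch of that untested idea, assuming each known peer carries a flag saying whether it speaks ebt. The `Peer` shape, the `supportsEbt` field, and `splitByReplication` are hypothetical names for illustration, not part of scuttlebot's actual gossip table or API:

```typescript
// Hypothetical peer record. `supportsEbt` is an assumed flag, not a field
// that scuttlebot is known to expose.
interface Peer {
  host: string;
  port: number;
  supportsEbt: boolean;
}

// Split peers into those we would keep a permanent connection to (ebt)
// and those we would only connect to briefly (legacy replication).
function splitByReplication(peers: Peer[]): { persistent: Peer[]; shortLived: Peer[] } {
  const persistent = peers.filter((p) => p.supportsEbt);
  const shortLived = peers.filter((p) => !p.supportsEbt);
  return { persistent, shortLived };
}

// Example with made-up pubs.
const knownPeers: Peer[] = [
  { host: "pub-a.example", port: 8008, supportsEbt: true },
  { host: "pub-b.example", port: 8008, supportsEbt: false },
];
const plan = splitByReplication(knownPeers);
console.log(plan.persistent.map((p) => p.host));
```

Whether this would actually avoid the hang is exactly the part that hasn't been tested.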

cc @Dominic

@dominic in #patchwork
Re: %wv/IenEk8

@cel it needs a mechanism so that it has just one peer in send mode per feed (and tests). Also, initial sync over ebt is currently slow, and that needs to be fixed. I've got some stuff for that at ssb-validate
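
To make the "one peer in send mode per feed" requirement concrete, here is a rough sketch. It assumes we know, for each feed, which connected peers could send it; `assignSenders` and the "first peer wins" policy are hypothetical and are not how ssb-ebt is known to do it:

```typescript
type FeedId = string;
type PeerId = string;

// For each feed, choose exactly one of the peers offering it to be in send mode.
// The policy here (take the first peer listed) is an arbitrary placeholder.
function assignSenders(offers: Map<FeedId, PeerId[]>): Map<FeedId, PeerId> {
  const senders = new Map<FeedId, PeerId>();
  offers.forEach((peerIds, feed) => {
    if (peerIds.length > 0) {
      senders.set(feed, peerIds[0]);
    }
  });
  return senders;
}

// Example with made-up feed and peer ids.
const offers = new Map<FeedId, PeerId[]>([
  ["@alice", ["pub-a", "pub-b"]],
  ["@bob", ["pub-b"]],
  ["@carol", []],
]);
const senders = assignSenders(offers);
console.log(senders.get("@alice"), senders.get("@bob"), senders.has("@carol"));
```

A real version would also need to pick a new sender when the chosen peer disconnects, which this sketch leaves out.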