[groovecoder-pto]: davidwalsh: what page?
[openjck]: Does anybody know when we pushed last? Chief history shows nothing.
[davidwalsh]: list_documents view is the one taking forever
[groovecoder-pto]: openjck:
[davidwalsh]: Oh and now it says the recent doc feed is blown up
[groovecoder-pto]: davidwalsh: yeah, but I think it's more than just that view/url/page
[groovecoder-pto]: davidwalsh: that's just the view that's suffering the worst
[davidwalsh]: Would re-pushing be a good or bad thing?
[davidwalsh]: Why am I not authorized to file an Infrastructure bug?
[davidwalsh]: That pisses me off
[groovecoder-pto]: davidwalsh: that is weird
[openjck]: davidwalsh: Are you logged in to Bugzilla?
[davidwalsh]: groovecoder|pto: I tried to handle this, sorry you have to pop in
[davidwalsh]: #it pointed me to #webops, which I try to join and nothing happens
[groovecoder-pto]: davidwalsh: yeah I tried that too :(
[Jeremie]: openjck: It looks like it is less slow now.
[groovecoder-pto]: davidwalsh: we have cturra and cyliang in here with us though ...
[davidwalsh]: groovecoder|pto: OK, didn't know they were webops/it
[openjck]: Jeremie: Yeah, it seems to be cycling between better and worse
[davidwalsh]: cturra: cyliang: New Relic is having a 'mare, can you help us figure out what's gone wrong?
[groovecoder-pto]: but this coincides closely with our last push. looks like we pushed when MDN audience was signing off, and then things started getting bad when Europe started signing on
[Jeremie]: Okay, good luck :)
[davidwalsh]: All of our PRs this week are client-side, with the exception of contributor bar
[cyliang]: c.turra is out today (holiday where he is)
[vectorvictor]: NR_ALERT: Alert downtime recovered for -- Triggered by: unable to ping -- Apps currently involved:
[cyliang]: I'm not as familiar with New Relic: let me try to loop in someone else.
[sheeri]: I am!
[groovecoder-pto]: davidwalsh: I disabled top_contributors flag just in case - though it was only active for super-users and beta testers
[groovecoder-pto]: cyliang: can you help us get back?
[openjck]: It looks like response time became normal about 5 minutes ago.
[openjck]: We should remember that the contributor thing was disabled /after/ that.
[vectorvictor]: NR_ALERT: Alert escalated to downtime for -- Triggered by: unable to ping -- Apps currently involved:
[davidwalsh]: openjck: Do you know how I can get access to:
[openjck]: davidwalsh: That URL looks busted -- it should be Infrastructure%20%26%20 and should not contain the ampersand itself
[openjck]: I'll pastebin you the right URL just in case your IRC client is messing it up
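To illustrate openjck's point about the busted URL: a raw ampersand inside a query-string value is read as a parameter separator, so it has to be percent-encoded as %26. A minimal Python sketch, assuming the product is called "Infrastructure & Operations" and the parameter is named `product` (the log only shows the `Infrastructure%20%26%20` prefix, so both are guesses), using the standard `urllib.parse` module:

```python
from urllib.parse import parse_qs, quote

# Assumed product name; the log only shows the "Infrastructure%20%26%20" prefix.
PRODUCT = "Infrastructure & Operations"


def encode_component(value: str) -> str:
    """Percent-encode a value so it is safe inside a query string."""
    # safe="" makes sure reserved characters such as "&" and "/" are encoded too.
    return quote(value, safe="")


# Raw ampersand: the parser splits the value at "&" and the name is truncated.
# "product" is a hypothetical parameter name used only for illustration.
broken = parse_qs("product=" + PRODUCT)
# Encoded ampersand: the full name survives the round trip.
fixed = parse_qs("product=" + encode_component(PRODUCT))

print(encode_component(PRODUCT))
print(broken["product"])
print(fixed["product"])
```

With the raw ampersand the parsed value stops at "Infrastructure "; with `%26` the whole name comes back intact.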
[davidwalsh]: OK
[cyliang]: groovecoder|pto: I can try - not familiar with that end of things.
[openjck]: davidwalsh:
[openjck]: davidwalsh: I also opened a bug and cc'd you
[davidwalsh]: openjck: OK, thanks
[davidwalsh]: "blocker" pages someone, yes?
[openjck]: I believe so
[groovecoder-pto]: davidwalsh, openjck: this is the deploy changeset I'm looking at:
[groovecoder-pto]: davidwalsh, openjck: seem right?
[vectorvictor]: NR_ALERT: Alert downtime recovered for -- Triggered by: unable to ping -- Apps currently involved:
[openjck]: Seems right to me
[cyliang]: groovecoder|pto: did that URL always work over HTTPS? (It works over HTTP)
[davidwalsh]: Hmm, that looks big
[groovecoder-pto]: cyliang: yeah, it's always worked over https
[groovecoder-pto]: cyliang: did we refresh some ssl certs or something?
[davidwalsh]: Though I couldn't push the one time I merged because VPN wouldn't load
[cyliang]: groovecoder|pto: No cert refreshes that I know of, but I know of some other tweaks this past week which might have had an effect.
[cyliang]: groovecoder|pto: what was the time you / someone visited the https url?
[solarce1]: chief has never been set up over https
[solarce1]: that's not how we deploy it since it's private only
[solarce1]: is the site slow for users or is it just NewRelic that's seeing slowness/ping failure?
[openjck]: solarce: Some users have reported MDN being painfully slow
[solarce1]: ok
[openjck]: davidwalsh noticed that one of the pull requests changed the size of the images used in the "contributors" bar. Could generating the images at the new size be causing this effect?
[openjck]: groovecoder|pto: ^
[solarce1]: catchpoint measured response time has about doubled from like 1.7s to 3.5s
[groovecoder-pto]: openjck: I disabled the contributors bar
[openjck]: Ah, right right
[vectorvictor]: NR_ALERT: Alert escalated to downtime for -- Triggered by: unable to ping -- Apps currently involved:
[groovecoder-pto]: solarce: this is the only suspicious output I see from the last deployment log:
[solarce1]: yeah, that's no good, that means some child processes didn't stop cleanly and may be zombied and blocking things
[groovecoder-pto]: solarce: also
[solarce1]: i see a couple httpd processes on developer1 that are spinning, lots of cpu, nothing in strace
[groovecoder-pto]: solarce: can we just bounce it?
[solarce1]: we can, i'll stop, cleanup anything left and start it
[groovecoder-pto]: solarce: cool, thanks.
[davidwalsh]: I'm not seeing anything in the changeset that's suspicious
[solarce1]: ugh
[solarce1]: $ pgrep httpd | wc -l [9:13:42]
[solarce1]: 78
[solarce1]: after i stop the service
[groovecoder-pto]: davidwalsh: yeah, I didn't see anything either
[groovecoder-pto]: solarce: 78?
[groovecoder-pto]: wow
[davidwalsh]: ?
[solarce1]: ok, cleaned up and started up
[groovecoder-pto]: solarce: cool, we'll keep an eye on it
[solarce1]: it looks more like 2 and 3 now, some spiky cpu but nothing pegged for a long time
[groovecoder-pto]: a zombie web-head left in the request rotation seems like it would explain the erratic behavior we saw ...
[solarce1]: groovecoder|pto: zombied web-heads, be sure to double tap
[groovecoder-pto]: solarce: lulz
[vectorvictor]: NR_ALERT: Alert downtime recovered for -- Triggered by: unable to ping -- Apps currently involved:
[davidwalsh]: I'm not geeky enough to understand what's up
[groovecoder-pto]: solarce: developer1 needs cardio
[davidwalsh]: Did something fluky happen or?
[groovecoder-pto]: davidwalsh: one of our web-heads was having issues
[solarce1]: davidwalsh: when apache restarted during the push it didn't do it cleanly and so a bunch of child processes were left running but not functional
[solarce1]: davidwalsh: we had to kill them off
[davidwalsh]: solarce: OK, thank you for the explanation
[davidwalsh]: I'm going to go back to my coloring book
[davidwalsh]: groovecoder|pto: Sorry again that you had to pop in, I know a better way to attack this now!
[groovecoder-pto]: solarce: what's our load-balancing method? round-robin per request? per time?
[solarce1]: groovecoder|pto: dumb round robin, the default
[groovecoder-pto]: solarce: k; just curious
[solarce1]: groovecoder|pto: is what zeus can do
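The "cycling between better and worse" symptom fits plain round-robin with one bad web-head in the rotation: with no health checks, every Nth request lands on the broken host. A rough Python simulation of that idea, assuming a pool of three web-heads with only developer1 unhealthy (the other host names and the pool size are assumptions, not taken from the load balancer config):

```python
from itertools import cycle

# Hypothetical pool: three web-heads, one of them unhealthy (as developer1 was).
WEBHEADS = ["developer1", "developer2", "developer3"]
UNHEALTHY = {"developer1"}


def simulate(requests: int) -> list[str]:
    """Return "slow" or "ok" per request under plain round-robin."""
    rotation = cycle(WEBHEADS)  # no health checks, no weighting
    return ["slow" if next(rotation) in UNHEALTHY else "ok" for _ in range(requests)]


outcomes = simulate(9)
print(outcomes)
print(f"{outcomes.count('slow')} of {len(outcomes)} requests hit the bad web-head")
```

Under these assumptions one request in three is slow, which a user would experience as the site being intermittently fine and intermittently painful.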
[groovecoder-pto]: davidwalsh, openjck: I'm 90%+ sure that was the issue
[solarce1]: i concur
[openjck]: groovecoder|pto: webheads?
[groovecoder-pto]: davidwalsh, openjck: so I'm going back offline. text me if it comes back
[groovecoder-pto]: openjck: yeah, developer1
[openjck]: Yep
[openjck]: Cool
[groovecoder-pto]: openjck: I'll send an email to staff list
[openjck]: I'll keep an eye on it. Thanks groovecoder|pto! Thanks solarce!
[davidwalsh]: groovecoder|pto: We'll handle it
[davidwalsh]: Go away
[groovecoder-pto]: davidwalsh: no!
[groovecoder-pto]: :)
[cyliang]: solarce: Does anything else on dev1. need to be restarted to have it feed system-level data to NR?
[solarce1]: cyliang: i'm not sure, i think you apply the sysmond class and make sure the sysmon daemon is running
[solarce1]: i haven't actually deployed the sysmon stuff yet :)
[solarce1]: openjck: happy to help
[cyliang]: solarce: K. looks like the nrsysmond processes on dev1. are from Feb.27th; the ones on dev2 and dev3 are from March 14th.
[cyliang]: solarce: I will attempt to (kindly) kick the one on dev1.
[solarce1]: oh, you mean prod
[solarce1]: i thought you meant
[solarce1]: but yeah, swift kick
[solarce1]: make sure it stops cleanly
[cyliang]: Sorry, should have added more context to the short name.
[cyliang]: nrsysmond kicked on developer1.webapp.scl3
[openjck]: Hey all. Just as a heads-up, I may need to run out for an unplanned dr. appt today. I'll keep you in the loop.
[icaaq]: hey, was it the contributor bar that caused the sluggishness?
[openjck]: Nope, doesn't appear to be
[openjck]: We should re-enable it, actually
[icaaq]: openjck ok :)
[groovecoder-pto]: alright, now I'm off again
[mdnstagepush]: oh nice, dwalsh is pushing mdn-stage a01fd5e01b4aeb1bd56fa5b2504fc81ef0dff909
[mdnstagepush]: watching
[mdnstagepush]: ...a01fd5e01b4aeb1bd56fa5b2504fc81ef0dff909
[mdnstagepush]: Finished: update_code (6.544s)
[mdnstagepush]: the push is now going to the webheads!! (a01fd5e01b4aeb1bd56fa5b2504fc81ef0dff909 dwalsh)
[mdnstagepush]: Finished: update_info (3.817s), pre_update (10.361s)
[mdnstagepush]: Finished: update_assets (12.748s), update_locales (1.602s)
[mdnstagepush]: Finished: database (5.730s), update (20.081s), rsync_project (1.532s), checkin_changes (1.573s), deploy_app (2.046s)
[mdnstagepush]: dwalsh pushed mdn-stage a01fd5e01b4aeb1bd56fa5b2504fc81ef0dff909
[mdnprodpush]: oh god, dwalsh is pushing mdn a01fd5e01b4aeb1bd56fa5b2504fc81ef0dff909
[mdnprodpush]: watching
[mdnprodpush]: Finished: update_code (6.843s)
[mdnprodpush]: the push is now going to the webheads!! (a01fd5e01b4aeb1bd56fa5b2504fc81ef0dff909 dwalsh)
[mdnprodpush]: Finished: update_info (3.755s), pre_update (10.598s)
[mdnprodpush]: Finished: update_assets (13.301s)
[mdnprodpush]: Finished: update_locales (2.160s)
[mdnprodpush]: Finished: database (5.958s), update (21.420s), rsync_project (2.237s), checkin_changes (2.286s)
[mdnprodpush]: Finished: deploy_app (12.858s)
[vectorvictor]: NR_DEPLOYMENT: [] null.
[mdnprodpush]: dwalsh pushed mdn a01fd5e01b4aeb1bd56fa5b2504fc81ef0dff909
[sheppy]: openjck / davidwalsh / jezdez - I'm throwing out a scatter-shot question since the people I think most likely know aren't here today... How is permission for page move working right now? I have someone who has "Can move a tree of documents" permission that doesn't actually have the option in their menu.
[openjck]: I think (?) it's done through a Waffle flag?
[openjck]: I believe it's enabled for staff only atm
[sheppy]: Hm.
[sheppy]: Yeah, maybe.
[sheppy]: We need to start letting trusted localizers have it too.
[openjck]: I think we have a bug open for making it a real permission, though
[sheppy]: So if I set the "page_move" waffle to allow authenticated users do it, it would give it to everyone right now?
[sheppy]: Jeez.
[openjck]: To all authenticated users, yeah
[openjck]: I don't know the reason behind keeping the feature relatively locked down. Les might be the best person to speak on that.
[openjck]: Or ubernostrum
[openjck]: See also
[firebot]: Bug 675842 enh, P3, ---, nobody, NEW, Permission for moving pages
[sheppy]: Well, it's dangerous if the thing is buggy. :)
[stephend]: hey, does swarnava hang out here?
[openjck]: Right. And it might be -- I don't remember
[sheppy]: But it's working great and we are doing huge amounts of moving of content and the localizers want to keep up.
[openjck]: I don't think so...
[stephend]: hrm, k
[openjck]: stephend: ^
[openjck]: MHm
[stephend]: wanted to see if he'd do a fix to our test-suite
[sheppy]: hm.
[stephend]: for the privacy policy change he just made for yours :_)
[openjck]: sheppy: Yeah, I would be tempted to say we can open it, but I'm unfortunately not the best person to say one way or the other.
[sheppy]: Right
[davidwalsh]: This is a sweet revision!$compare?from=507875&to=507881
[davidwalsh]: sheppy: fscholz ^
[sheppy]: Yeah, isn't it though?
[sheppy]: Man, my internet connection is sluggish today.
[davidwalsh]: sheppy: Can't believe you guys have to deal with that crap
[sheppy]: icaaq / davidwalsh - btw, the contributor bar is looking teh sexay.
[sheppy]: davidwalsh: All. The. Time. :)
[davidwalsh]: That's all icaaq
[sheppy]: Actually, it's not too bad. We don't get as many of those as you'd think, as a percentage of our total edits.
[davidwalsh]: Though…I did add the padding and background/border to make it look contained
[davidwalsh]: So…yeah, still not taking any credit
[icaaq]: sheppy davidwalsh we're all doing it together :)
[sheppy]: davidwalsh: Here's another good one:$revision/538355
[davidwalsh]: hahaha
[davidwalsh]: He hacked us good
[sheppy]: Yeah, sooooo hard to do that.
[sheppy]: And I totally totally believe it was the "Afghan Cyber Army." :)
[sheppy]: And not some pimply 14-year-old kid.
[openjck]: lolol
[openjck]: I wonder what convinced them that this was abnormal. Did they find the $edit page through means other than... you know... hitting the edit button?
[sheppy]: :)
[openjck]: "I moved my car's mirrors out of their default position. Pssh, I'm kind of like Jeff Gordon."
[sheppy]: :)
[github]: [kuma] darkwing opened pull request #2310: Minor syntax updates to contributor stuff (master...minor-contrib-updates)
[github]: [kuma] openjck pushed 2 new commits to master:
[github]: kuma/master 2b9e26f David Walsh: Removing unneeded compat-importants
[github]: kuma/master 6e21009 John Karahalis: Merge pull request #2307 from darkwing/remove-compat-importants...
[davidwalsh]: Yusssssssss
[davidwalsh]: Once we get the migration done, our CSS is going to be soooooo small
[davidwalsh]: :D :D :D
[davidwalsh]: We probably have like 5k worth of "!important"
[openjck]: Hah, you may be right about that
[openjck]: The good news is that I plan to work on migration for the rest of today and for all of next week
[davidwalsh]: Yessss
[openjck]: TBH I have been doing a pretty good job ignoring it. Always hard to justify prioritizing it when everything works just fine without it.
[davidwalsh]: B.L.O.A.T.
[openjck]: But I'll bite the bullet on my other priorities for a week and just FDI.
[davidwalsh]: Bite the bullet … or I bite you
[openjck]: davidwalsh's bite
[davidwalsh]: haha
[davidwalsh]: Be back in like 30
[github]: [kuma] groovecoder opened pull request #2311: remove some duplicate method calls in doc template (master...small-doc-template-optimizations)
[groovecoder-pto]: rogerroger: sending pull 2311 on PTO like it aint no thang
[rogerroger]: I don't trust you, groovecoder|pto, are you identified with nickserv?
[groovecoder]: rogerroger: sending pull 2311 on PTO like it aint no thang
[rogerroger]: Ok, submitted status #10751 for
[davidwalsh]: groovecoder|pto: go away
[github]: [kuma] treyhunner opened pull request #2312: Use shorter EditorConfig file syntax (master...editorconfig-fix)
[icaaq]: davidwalsh I thought of this
[davidwalsh]: OK
[icaaq]: is there a performance gain of doing it like this?
[icaaq]: I'm not saying it's a bad thing just that it's harder to comprehend :)
[davidwalsh]: best practice, I guess?
[davidwalsh]: Is there a specific line?
[icaaq]: :285
[icaaq]: was the one i thought of :)
[davidwalsh]: haha
[davidwalsh]: I knew someone would talk jive there
[davidwalsh]: I don't know, I just prefer it
[icaaq]: davidwalsh it's not a big thing, but I think ternaries are harder to read
[davidwalsh]: If people tell me I'm wrong, then I'm wrong
[github]: [kuma] groovecoder closed pull request #2311: remove some duplicate method calls in doc template (master...small-doc-template-optimizations)
[icaaq]: why is groovecoder|pto doing stuff today? can someone ban him from github ;)
[icaaq]: davidwalsh also there's this
[davidwalsh]: /me shrugs
[davidwalsh]: Don't agree
[davidwalsh]: There's no question what that one line does
[icaaq]: davidwalsh no, I guess not, but if I look at that line it takes longer for me to understand what it does than with the if statement. but maybe that's just me being slow :D
[davidwalsh]: You're probably right
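The line under discussion (:285) is not shown in the log, so the following is only a hypothetical Python sketch of the trade-off icaaq and davidwalsh are debating: the same decision written as an if/else block and as a one-line conditional expression. The function names and the "contributor" labels are invented for illustration.

```python
def label_if(count: int) -> str:
    """Explicit if/else form: more lines, each branch is spelled out."""
    if count == 1:
        label = "contributor"
    else:
        label = "contributors"
    return label


def label_ternary(count: int) -> str:
    """Same logic as a one-line conditional expression (ternary)."""
    return "contributor" if count == 1 else "contributors"


for n in (0, 1, 2):
    # Both forms always agree; the difference is purely readability.
    print(n, label_if(n), label_ternary(n))
```

Both functions behave identically, so the choice comes down to which form the team finds easier to read.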
[github]: [kuma] openjck closed pull request #2312: Use shorter EditorConfig file syntax (master...editorconfig-fix)
[vectorvictor]: NR_ALERT: Alert opened for MDN Doc View -- Triggered by: 'MDN Doc View' Error rate > 0.5% -- Key transactions currently involved: MDN Doc View.