summaryrefslogtreecommitdiff
path: root/indra/llmessage
AgeCommit message (Collapse)Author
2020-10-01SL-14037 BugSplat Crash #646590: Enqueue failed in AISAndrey Kleshchev
2020-09-02SL-13891 Coroutine creation was requested on exitAndrey Kleshchev
2020-08-28SL-13555 'Second Life quit unexpectedly' error messageAndrey Kleshchev
2020-08-28SL-13811 Crash on coroprocedureAndrey Kleshchev
2020-08-20SL-13811 Crash on coroprocedureAndrey Kleshchev
Coroprosedure should stop on 'stop' exception
2020-08-18SL-13783 Workaround for enqueueCoprocedure() crash #2Andrey Kleshchev
2020-08-17Merged in SL-13783 and SL-13789Andrey Kleshchev
2020-08-17SL-13783 Workaround for enqueueCoprocedure() crash with asset storageAndrey Kleshchev
2020-08-11Merge branch 'DRTVWR-501-maint' into DRTVWR-503-maintAndrey Lihatskiy
# Conflicts: # indra/cmake/DirectX.cmake # indra/newview/llviewerparcelmedia.cpp # indra/newview/viewer_manifest.py
2020-07-29Merge branch 'DRTVWR-476' into DRTVWR-501-maintAndrey Kleshchev
2020-07-24SL-13679 Event pump DupListenerName crash at loginAndrey Kleshchev
2020-07-22Remove redundant LL_EXSTAT_ from enums.Nicky Dasmijn
2020-07-22LLExtStat had been a S32, this wasn't right, as some of the constants lead ↵Nicky Dasmijn
to integer overflow: const LLExtStat LL_EXSTAT_RES_RESULT = 2L<<30; const LLExtStat LL_EXSTAT_VFS_RESULT = 3L<<30; This shifts into the sign bit and clang gets (rightfully) upset about this. LLExtStatus needs to be at least of type U32 to remedy this problem, but while at it it makes sense to turn it into what it is: An enum. Turning it into a class enum has the added benefit we get type safety for mostly free. Which incidentally turned up a problem right away: A call to removeAndCallbackPendingDownloads had status and extstatus reversed and thus was wrong.
2020-05-27DRTVWR-476: Add "Socket" debug log output for socket operations.Nat Goodspeed
Enable the body of the existing ll_debug_socket() function (on Mac as well as Linux), but using tag "Socket" so you can turn on its log messages without emitting *all* debug messages.
2020-05-21DRTVWR-476: Support older compilers with LockMessageReader.Nat Goodspeed
2020-05-20DRTVWR-476: Fix LLCoprocedurePool::enqueueCoprocedure() shutdown crash.Nat Goodspeed
2020-05-19Make sure coproc gets destroyed after each iteration.Nicky Dasmijn
Making coproc scoped to the for loop will make sure the destructor gets called every loop iteration. Keeping it's scope outside the for loop means the pointer keeps valid till the next assigment that happens inside pop_wait_for when it gets assigned a new value. Triggering the dtor inside pop_wait_for can lead to deadlock when inside the dtor a coroutine tries to call enqueueCoprocedure (this happens). enqueueCoprocedure then will try to grab the lock for try_push but this lock is still held by pop_wait_for.
2020-05-19DRTVWR-476: Clean up reverting to boost::fibers::buffered_channel.Nat Goodspeed
2020-05-19DRTVWR-476: Revert "Use LLThreadSafeQueue, not boost::fibers::buffered_channel."Nat Goodspeed
This reverts commit bf8aea5059f127dcce2fdf613d62c253bb3fa8fd. Try boost::fibers::buffered_channel again with Boost 1.72.
2020-05-14DRTVWR-476, SL-12204: Fix crash in Marketplace Listings.Nat Goodspeed
The observed crash was due to sharing a stateful global resource (the global LLMessageSystem instance) between different tasks. Specifically, a coroutine sets its mMessageReader one way, expecting that value to persist until it's done with message parsing, but another coroutine sneaks in at a suspension point and sets it differently. Introduce LockMessageReader and LockMessageChecker classes, which must be instantiated by a consumer of the resource. The constructor of each locks a coroutine-aware mutex, so that for the lifetime of the lock object no other coroutine can instantiate another. Refactor the code so that LLMessageSystem::mMessageReader can only be modified by LockMessageReader, not by direct assignment. mMessageReader is now an instance of LLMessageReaderPointer, which supports dereferencing and comparison but not assignment. Only LockMessageReader can change its value. LockMessageReader addresses the use case in which the specific mMessageReader value need only persist for the duration of a single method call. Add an instance in LLMessageHandlerBridge::post(). LockMessageChecker is a subclass of LockMessageReader: both lock the same mutex. LockMessageChecker addresses the use case in which the specific mMessageReader value must persist across multiple method calls. Modify the methods in question to require a LockMessageChecker instance. Provide LockMessageChecker forwarding methods to facilitate calling the underlying LLMessageSystem methods via the LockMessageChecker instance. Add LockMessageChecker instances to LLAppViewer::idleNetwork(), a couple cases in idle_startup() and LLMessageSystem::establishBidirectionalTrust().
2020-05-06DRTVWR-476: Merge branch 'master' of lindenlab/viewer into DRTVWR-476-boost-1.72Nat Goodspeed
2020-05-05Merge branch 'DRTVWR-501-maint' into DRTVWR-503-maintAndrey Lihatskiy
# Conflicts: # indra/newview/llinventorybridge.cpp # indra/newview/llinventorypanel.cpp # indra/newview/lltexturectrl.cpp # indra/newview/skins/default/xui/de/floater_texture_ctrl.xml # indra/newview/skins/default/xui/es/floater_texture_ctrl.xml # indra/newview/skins/default/xui/fr/floater_texture_ctrl.xml # indra/newview/skins/default/xui/it/floater_texture_ctrl.xml # indra/newview/skins/default/xui/ja/floater_texture_ctrl.xml # indra/newview/skins/default/xui/pt/floater_texture_ctrl.xml # indra/newview/skins/default/xui/ru/floater_texture_ctrl.xml # indra/newview/skins/default/xui/tr/floater_texture_ctrl.xml # indra/newview/skins/default/xui/zh/floater_texture_ctrl.xml
2020-04-20Merge branch 'master' into DRTVWR-500Andrey Lihatskiy
# Conflicts: # indra/newview/pipeline.cpp
2020-03-25DRTVWR-476: Use LLThreadSafeQueue::close() to shut down coprocs.Nat Goodspeed
The tactic of pushing an empty QueuedCoproc::ptr_t to signal coprocedure close only works for LLCoprocedurePools with a single coprocedure (e.g. "Upload" and "AIS"). Only one coprocedureInvokerCoro() coroutine will pop that empty pointer and shut down properly -- the rest will continue waiting indefinitely. Rather than pushing some number of empty pointers, hopefully enough to notify all consumer coroutines, close() the queue. That will notify as many consumers as there may be. That means catching LLThreadSafeQueueInterrupt from popBack(), instead of detecting empty pointer. Also, if a queued coprocedure throws an exception, coprocedureInvokerCoro() logs it as before -- but instead of rethrowing it, the coroutine now loops back to wait for more work. Otherwise, the number of coroutines servicing the queue dwindles.
2020-03-25DRTVWR-476, SL-12197: Don't throw Stopping from main coroutine.Nat Goodspeed
The new LLCoros::Stop exception is intended to terminate long-lived coroutines -- not interrupt mainstream shutdown processing. Only throw it on an explicitly-launched coroutine. Make LLCoros::getName() (used by the above test) static. As with other LLCoros methods, it might be called after the LLCoros LLSingleton instance has been deleted. Requiring the caller to call instance() implies a possible need to also call wasDeleted(). Encapsulate that nuance into a static method instead.
2020-03-25DRTVWR-476: Introduce LLCoprocedureManager::close(). Use in tests.Nat Goodspeed
The new close(void) method simply acquires the logic from ~LLCoprocedureManager() (which now calls close()). It's useful, even if only in test programs, to be able to shut down all existing LLCoprocedurePools without having to name them individually -- and without having to destroy the LLCoprocedureManager singleton instance. Deleting an LLSingleton should be done only once per process, whereas test programs want to reset the LLCoprocedureManager after each test.
2020-03-25DRTVWR-476: Use LLThreadSafeQueue, not boost::fibers::buffered_channel.Nat Goodspeed
We've observed buffered_channel::try_push() hanging, which seems very odd. Try our own LLThreadSafeQueue instead.
2020-03-25DRTVWR-476: Manually count items in LLCoprocedurePool's pending queue.Nat Goodspeed
Reinstate LLCoprocedureManager::countPending() and count() methods. These were removed because boost::fibers::buffered_channel has no size() method, but since all users run within a single thread, it works to increment and decrement a simple counter. Add count information and max queue size to log messages.
2020-03-25DRTVWR-476: Use shared_ptr to manage lifespan of coprocedure queue.Nat Goodspeed
Since the consuming coroutine LLCoprocedurePool::coprocedureInvokerCoro() has been observed to outlive the LLCoprocedurePool instance that owns the CoprocQueue_t, closing that queue isn't enough to keep the coroutine from crashing at shutdown: accessing a deleted CoprocQueue_t is fatal whether or not it's been closed. Make LLCoprocedurePool store a shared_ptr to a heap CoprocQueue_t instance, and pass that shared_ptr by value to consuming coroutines. That way the CoprocQueue_t instance is guaranteed to live as long as the last interested party.
2020-03-25DRTVWR-476: Back out changeset 40c0c6a8407d ("final" LLApp listener)Nat Goodspeed
2020-03-25DRTVWR-476: Pump coroutines a few more times when we start quitting.Nat Goodspeed
By the time "LLApp" listeners are notified that the app is quitting, the mainloop is no longer running. Even though those listeners do things like close work queues and inject exceptions into pending promises, any coroutines waiting on those resources must regain control before they can notice and shut down properly. Add a final "LLApp" listener that resumes ready coroutines a few more times. Make sure every other "LLApp" listener is positioned before that new one.
2020-03-25DRTVWR-476: Terminate long-lived coroutines to avoid shutdown crash.Nat Goodspeed
Add LLCoros::TempStatus instances around known suspension points so printActiveCoroutines() can report what each suspended coroutine is waiting for. Similarly, sprinkle checkStop() calls at known suspension points. Make LLApp::setStatus() post an event to a new LLEventPump "LLApp" with a string corresponding to the status value being set, but only until ~LLEventPumps() -- since setStatus() also gets called very late in the application's lifetime. Make postAndSuspendSetup() (used by postAndSuspend(), suspendUntilEventOn(), postAndSuspendWithTimeout(), suspendUntilEventOnWithTimeout()) add a listener on the new "LLApp" LLEventPump that pushes the new LLCoros::Stopping exception to the coroutine waiting on the LLCoros::Promise. Make it return the new LLBoundListener along with the previous one. Accordingly, make postAndSuspend() and postAndSuspendWithTimeout() store the new LLBoundListener returned by postAndSuspendSetup() in a LLTempBoundListener (as with the previous one) so it will automatically disconnect once the wait is over. Make each LLCoprocedurePool instance listen on "LLApp" with a listener that closes the queue on which new work items are dispatched. Closing the queue causes the waiting dispatch coroutine to terminate. Store the connection in an LLTempBoundListener on the LLCoprocedurePool so it will disconnect automatically on destruction. Refactor the loop in coprocedureInvokerCoro() to instantiate TempStatus around the suspending call. Change a couple spammy LL_INFOS() calls to LL_DEBUGS(). Give all logging calls in that module a "CoProcMgr" tag to make it straightforward to re-enable the LL_DEBUGS() calls as desired.
2020-03-25DRTVWR-476: Re-enable an llcoproceduremanager_test case.Nat Goodspeed
Use new Sync class to make the driving logic wait for the coprocedure to run.
2020-03-25[DRTVWR-476] - temp fix to test. comment it out. access violation in releaseAnchor
2020-03-25[DRTVWR-476] - temp fix to a testAnchor
2020-03-25General cleanup. Delete commented out code.Nicky
2020-03-25Replace boost::fibers::unbuffered_channel with boost::fibers::buffered_channel.Nicky
Using boost::fibers::unbuffered_channel can block the mainthread when calling mPendingCoprocs.push (LLCoprocedurePool::enqueueCoprocedure) From the documentation: - If a fiber attempts to send a value through an unbuffered channel and no fiber is waiting to receive the value, the channel will block the sending fiber. This can happen if LLCoprocedurePool::coprocedureInvokerCoro is running a coroutine and this coroutine calls yield, resuming the viewers main loop. If inside the main loop someone calls LLCoprocedurePool::enqueueCoprocedure now push will block, as there's no one waiting for a result right now. The wait would be in LLCoprocedurePool::coprocedureInvokerCoro at the start of the while loop, but we have not reached that yet again as LLCoprocedurePool::coprocedureInvokerCoro did yield before reaching pop_wait_for. The result is a deadlock. boost::fibers::buffered_channel will not block as long as there's space in the channel. A size of 4096 (DEFAULT_QUEUE_SIZE) should be plenty enough for this.
2020-03-25[DRTVWR-476] - fix compiler errors 32 bit windows buildAnchor
2020-03-25Do not use string/chrono literals, sadly that won't work with GCC (4.9)Nicky
2020-03-25[DRTVWR-476] - update cef, fix mergeAnchor
2020-03-25[DRTVWR-476] - conflicts with a mac macroAnchor
2020-03-25Implemented some code review suggested cleanups.Brad Kittenbrink
2020-03-25Improved aggregate init syntax for DefaultPoolSizes map.Brad Kittenbrink
2020-03-25Improved shutdown behavior of LLCoprocedureManagerBrad Kittenbrink
2020-03-25First draft of boost::fibers::unbuffered_channel based implementation of ↵Brad Kittenbrink
LLCoprocedureManager
2020-03-25Lint fixes on new test file.Brad Kittenbrink
2020-03-25Attempt to close LLEventCoro's LLBoundListener connection when promise has ↵Brad Kittenbrink
been fulfilled.
2020-03-25Began work for adding a test covering LLCoprocedureManagerBrad Kittenbrink
2020-03-25SL-793: Use Boost.Fiber instead of the "dcoroutine" library.Nat Goodspeed
Longtime fans will remember that the "dcoroutine" library is a Google Summer of Code project by Giovanni P. Deretta. He originally called it "Boost.Coroutine," and we originally added it to our 3p-boost autobuild package as such. But when the official Boost.Coroutine library came along (with a very different API), and we still needed the API of the GSoC project, we renamed the unofficial one "dcoroutine" to allow coexistence. The "dcoroutine" library had an internal low-level API more or less analogous to Boost.Context. We later introduced an implementation of that internal API based on Boost.Context, a step towards eliminating the GSoC code in favor of official, supported Boost code. However, recent versions of Boost.Context no longer support the API on which we built the shim for "dcoroutine." We started down the path of reimplementing that shim using the current Boost.Context API -- then realized that it's time to bite the bullet and replace the "dcoroutine" API with the Boost.Fiber API, which we've been itching to do for literally years now. Naturally, most of the heavy lifting is in llcoros.{h,cpp} and lleventcoro.{h,cpp} -- which is good: the LLCoros layer abstracts away most of the differences between "dcoroutine" and Boost.Fiber. The one feature Boost.Fiber does not provide is the ability to forcibly terminate some other fiber. Accordingly, disable LLCoros::kill() and LLCoprocedureManager::shutdown(). The only known shutdown() call was in LLCoprocedurePool's destructor. We also took the opportunity to remove postAndSuspend2() and its associated machinery: FutureListener2, LLErrorEvent, errorException(), errorLog(), LLCoroEventPumps. All that dual-LLEventPump stuff was introduced at a time when the Responder pattern was king, and we assumed we'd want to listen on one LLEventPump with the success handler and on another with the error handler. We have never actually used that in practice. Remove associated tests, of course. There is one other semantic difference that necessitates patching a number of tests: with "dcoroutine," fulfilling a future IMMEDIATELY resumes the waiting coroutine. With Boost.Fiber, fulfilling a future merely marks the fiber as ready to resume next time the scheduler gets around to it. To observe the test side effects, we've inserted a number of llcoro::suspend() calls -- also in the main loop. For a long time we retained a single unit test exercising the raw "dcoroutine" API. Remove that. Eliminate llcoro_get_id.{h,cpp}, which provided llcoro::get_id(), which was a hack to emulate fiber-local variables. Since Boost.Fiber has an actual API for that, remove the hack. In fact, use (new alias) LLCoros::local_ptr for LLSingleton's dependency tracking in place of llcoro::get_id(). In CMake land, replace BOOST_COROUTINE_LIBRARY with BOOST_FIBER_LIBRARY. We don't actually use the Boost.Coroutine for anything (though there exist plausible use cases).
2020-03-25DRTVWR-494: Use std::thread::id for LLThread::currentID().Nat Goodspeed
LLThread::currentID() used to return a U32, a distinct unsigned value incremented by explicitly constructing LLThread or by calling LLThread:: registerThreadID() early in a thread launched by other means. The latter imposed an unobvious requirement on new code based on std::thread. Using std::thread::id instead delegates to the compiler/library the problem of distinguishing threads launched by any means. Change lots of explicit U32 declarations. Introduce LLThread::id_t typedef to avoid having to run around fixing uses again if we later revisit this decision. LLMutex, which stores an LLThread::id_t, wants a distinguished value meaning NO_THREAD, and had an enum with that name. But as std::thread::id promises that the default-constructed value is distinct from every valid value, NO_THREAD becomes unnecessary and goes away. Because LLMutex now stores LLThread::id_t instead of U32, make llmutex.h #include "llthread.h" instead of the other way around. This makes LLMutex an incomplete type within llthread.h, so move LLThread::lockData() and unlockData() to the .cpp file. Similarly, remove llrefcount.h's #include "llmutex.h" to break circularity; instead forward-declare LLMutex. It turns out that a number of source files assumed that #include "llthread.h" would get the definition for LLMutex. Sprinkle #include "llmutex.h" as needed. In the SAFE_SSL code in llcorehttp/httpcommon.cpp, there's an ssl_thread_id() callback that returns an unsigned long to the SSL library. When LLThread:: currentID() was U32, we could simply return that. But std::thread::id is very deliberately opaque, and can't be reinterpret_cast to unsigned long. Fortunately it can be hashed because std::hash is specialized with that type.