So here's my idea....
1- Use top to see all the java processes and threads by CPU utilization (capital H, or the -H flag, displays threads). In thread mode top shows each thread's LWP id, not the PID of the process, in the PID column.
Example (from top data):
[root@host ~]# top -H -n 3 -b |grep tomcat | grep java | sort -rn -k 9 | head -1
6638 tomcat 20 0 10.5g 2.9g 12m S 98.9 39.2 0:03.16 java
2- Use ps -L -utomcat to list the threads owned by the tomcat user, grep for the LWP id, and get the PID of the java process that owns it.
Example: I'm grepping for the LWP id. The first number is the PID and the second is the LWP id:
[root@host ~]# ps -L -utomcat |grep java | grep 6638
27628 6638 ? 00:00:03 java
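One catch with the plain grep above is that it matches 6638 anywhere on the line, including inside a longer number or in the PID column. A minimal sketch of a stricter lookup, assuming the default two leading columns (PID, LWP) that ps -L prints; the function name pid_for_lwp is just something I made up for illustration:

```shell
# Hypothetical helper: read `ps -L` output on stdin and print the PID
# of the process whose LWP column (field 2) equals the given LWP id.
pid_for_lwp() {
    awk -v lwp="$1" '$2 == lwp { print $1 }'
}
```

Usage would be something like: ps -L -utomcat | pid_for_lwp 6638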
3- So now I have the PID of the java process and the LWP id of the bad thread. I can take a stack trace of java. The stack trace records each thread's LWP id as "nid" in hex, so converting a nid from hex to decimal gives the LWP id (here LWP 6638 is nid 0x19ee).
Do a kill -s SIGQUIT $tomcat_pid to the process to force a thread dump (which will write out to catalina.out for tomcat).
[root@host ~]# kill -s SIGQUIT 27628
Then convert all the nid values to LWP ids with a quick perl script that converts the hex to decimal (which I saved and named /tmp/convert-nid-to-lwp.pl).
[root@host ~]# cat /usr/local/tomcat/default/logs/catalina.out.thread.dump.2013-10-09--13-54-49 | /tmp/convert-nid-to-lwp.pl > /usr/local/tomcat/default/logs/catalina.out.thread.dump.2013-10-09--13-54-49-nlwp
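The perl script itself isn't reproduced here. As a rough stand-in, here is a plain sh sketch of the same idea. It assumes HotSpot-style thread header lines that contain a token like nid=0x19ee, and it appends a made-up lwp=<decimal> tag right after it. The function name nid_to_lwp and the tag format are my own inventions for illustration, not what the original script does.

```shell
# Hypothetical filter: copy a thread dump from stdin to stdout and
# append " lwp=<decimal>" after the first "nid=0x..." token on each line.
nid_to_lwp() {
    while IFS= read -r line; do
        case $line in
            *nid=0x[0-9a-fA-F]*)
                nid=${line#*nid=}             # text after the first "nid="
                nid=${nid%%[!0-9a-fA-Fx]*}    # keep just the hex token, e.g. 0x19ee
                lwp=$(($nid))                 # shell arithmetic converts hex to decimal
                prefix=${line%%nid=*}         # everything before the first "nid="
                rest=${line#*nid=$nid}        # everything after the hex token
                printf '%snid=%s lwp=%s%s\n' "$prefix" "$nid" "$lwp" "$rest"
                ;;
            *)
                # Stack frames and other lines pass through untouched.
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

You'd run it the same way as the perl version, piping the saved dump through nid_to_lwp and redirecting to a new file.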
4- So now I have a thread dump with all the threads tagged by LWP number. I find my LWP id (6638) in the thread dump and I've got a stack trace of the thread that's eating up all the CPU.
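If you only care about the one hot thread, you could probably skip rewriting the whole dump and go the other way: turn the LWP id into hex and grep the raw dump for that nid. A minimal sketch, assuming the dump prints nid in lowercase hex with a 0x prefix; lwp_to_nid is a hypothetical helper name:

```shell
# Hypothetical helper: print the decimal LWP id as the hex nid
# string that appears in the thread dump.
lwp_to_nid() {
    printf '0x%x\n' "$1"
}
```

For example, grep -A 20 "nid=$(lwp_to_nid 6638) " on the saved dump file should show the header line of the thread plus the first 20 lines after it (the 20 is arbitrary).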